July 3, 2010

High volume, fast product data importing in ASPDotNetStorefront

Filed under: Tips — Webopius @ 11:29 am

If you have up to a few thousand products and variants in your ASPDotNetStorefront store then this article may not be of interest to you. The ASPDotNetStorefront ecommerce platform comes pre-configured with Excel and XML-based import routines along with the WSI interface.

Once you get beyond a few thousand products, you may find yourself looking for a faster method of importing products into your store. This is almost certainly going to be the case if you regularly need to update your products along with pricing and stock availability.

High volume product import for ASPDotNetStorefront

Webopius have recently developed a high-volume ASPDotNetStorefront import module for a couple of clients. In one case, the client has in excess of 100,000 products that need to be updated daily. The update involves creating new products, updating existing products, handling variants, category mapping, pricing and inventory – a lot to do quickly.

Using the standard ASPDotNetStorefront import methods on this number of products took hours to run. Even our initial attempt to develop a module resulted in a processing speed of approximately 1,000 products a minute. Fast, but not fast enough.

After making some enhancements to the code, at the time of writing we are able to import and update between approximately 10,000 and 15,000 products per minute. [Note: Of course, performance will vary depending on many factors and each installation is different.]

Performance tuning the ASPDotNetStorefront product import

The high-level steps we took to create a fast product importer are listed below. If you would like Webopius to create a custom product import system for your ASPDotNetStorefront site, then please contact us.

1. Use XML as the source format

The first step we took was to get the source data into an XML format. XML is a natural fit for ASPDotNetStorefront, SQL Server and ASP.NET, and the resulting files can be parsed and passed quickly between the various site components.
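
For illustration, a minimal source file might look like the following. The element and attribute names here are hypothetical; the real schema will depend on your data feed:

    <Products>
      <Product SKU="ABC-001">
        <Name>Example Widget</Name>
        <Price>19.99</Price>
        <Inventory>42</Inventory>
        <Categories>
          <Category ID="12" />
        </Categories>
      </Product>
      <!-- ...thousands more Product elements... -->
    </Products>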

2. Use XmlTextReader to parse and read the source file

If you’ve ever looked through Microsoft’s XML documentation, you’ll see there are myriad ways to read and write XML – XmlReader, XmlTextReader, XPathNavigator, XmlDocument …

XmlTextReader is ideal for processing large XML document streams. It reads forward through the file without loading it all into memory, doing a great deal of validation, or supporting XPath/XSLT. Basically, it’s quick.
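
As a minimal sketch, assuming a feed shaped like the sample above (element and attribute names hypothetical), the forward-only read looks like this:

    using System.Xml;

    // Forward-only, non-cached read of a large product feed.
    using (XmlTextReader reader = new XmlTextReader("products.xml"))
    {
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "Product")
            {
                // Hypothetical attribute name; in practice you would buffer
                // the product's fields here for the next batch.
                string sku = reader.GetAttribute("SKU");
            }
        }
    }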

3. Process the data in batches

Rather than processing the data one product at a time, you should process blocks of products in one go. In our case, we read 1,000 products, then package them up into a single call to the database.

If you don’t do this, every product incurs one or more round trips to the database, which slows processing down significantly. A sketch of the batching loop is below.
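
In this sketch, ReadProducts is a hypothetical helper that yields each Product element as a string (for example via XmlTextReader.ReadOuterXml), and SendBatchToDatabase stands in for the single database call described in step 4:

    using System.Collections.Generic;

    const int BatchSize = 1000;
    List<string> batch = new List<string>(); // each entry is one <Product> fragment

    foreach (string productXml in ReadProducts("products.xml")) // hypothetical reader
    {
        batch.Add(productXml);
        if (batch.Count == BatchSize)
        {
            SendBatchToDatabase(batch); // one database call per 1,000 products
            batch.Clear();
        }
    }
    if (batch.Count > 0)
    {
        SendBatchToDatabase(batch); // flush the final partial batch
    }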

4. Move all processing to the database

Of all the performance tuning steps, this is the one to focus your effort on. Move as much of your data checking and processing to SQL Server as possible.

SQL Server is a data crunching powerhouse. Why do product SKU lookups or category mapping row by row in your C# or VB code when the database can do the same job, set-based, hundreds of times faster?

Originally, our code logic looked something like this:

– Take product SKU from XML, call the database to get a Product ID
– If ProductID was 0 (SKU not found), then this is a new Product, DB insert needed, otherwise DB Update needed
– If insert performed, lookup the new ProductID for the Category mapping
– For each category mapping in the XML source file…
– Look for an existing category mapping, create one if none exists
– Delete any mappings that aren’t in the XML source file.

A sensible approach, but crazy when SQL Server is sitting around waiting to do the job for you. We moved as much of the ASPDotNetStorefront import processing into SQL Server stored procedures as possible. We pass each batch of 1,000 products in XML format to a custom-written stored procedure that creates/updates products and variants, and to a second procedure that handles the category mappings. This second procedure was the most complex to write, as it has to delete category mappings accurately and quickly.
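
The C# side of that hand-off is small. Here is a sketch, assuming a stored procedure named dbo.ImportProductBatch (a hypothetical name) that accepts the batch as an xml parameter:

    using System.Data;
    using System.Data.SqlClient;

    // batchXml holds up to 1,000 <Product> elements as one XML document.
    using (SqlConnection conn = new SqlConnection(connectionString))
    using (SqlCommand cmd = new SqlCommand("dbo.ImportProductBatch", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add("@ProductBatch", SqlDbType.Xml).Value = batchXml;
        conn.Open();
        cmd.ExecuteNonQuery();
    }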

The new code logic looks something like this (a stored procedure sketch follows the list):

– Pass 1,000 Products in XML format to SQL Server
– [In Database], Shred the XML batch into a temporary database table
– [In Database], Lookup the ProductIDs based on the XML SKU field value
– [In Database], Any non-matching SKUs will have a Product ID of 0, insert new products and variants for these
– [In Database], For the other non-zero Product IDs, update products and variants
– Pass Category mappings for 1,000 Products in XML format to SQL Server
– [In Database], Create new Product Category mappings based on XML file
– [In Database], Remove any Product Category mappings for the products in XML where category IDs don’t exist in the source file.
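
To make the product half of that concrete, here is a T-SQL sketch. The procedure name is the hypothetical one from the C# sketch above, and the columns are deliberately cut down; treat the whole thing as illustrative rather than as the exact ASPDNSF schema:

    CREATE PROCEDURE dbo.ImportProductBatch (@ProductBatch xml)
    AS
    BEGIN
        SET NOCOUNT ON;

        -- 1. Shred the XML batch into a temp table in a single pass
        SELECT  x.p.value('@SKU', 'nvarchar(100)')    AS SKU,
                x.p.value('Name[1]', 'nvarchar(400)') AS Name
        INTO    #Incoming
        FROM    @ProductBatch.nodes('/Products/Product') AS x(p);

        -- 2. Set-based SKU lookup; unmatched SKUs resolve to ProductID 0
        SELECT  i.SKU, i.Name, COALESCE(p.ProductID, 0) AS ProductID
        INTO    #Matched
        FROM    #Incoming i
        LEFT JOIN dbo.Product p ON p.SKU = i.SKU;

        -- 3. One UPDATE covers every existing product in the batch
        UPDATE  p
        SET     p.Name = m.Name
        FROM    dbo.Product p
        JOIN    #Matched m ON m.ProductID = p.ProductID
        WHERE   m.ProductID <> 0;

        -- 4. One INSERT covers every new product in the batch
        INSERT  dbo.Product (SKU, Name)
        SELECT  SKU, Name
        FROM    #Matched
        WHERE   ProductID = 0;

        -- Variants and category mappings are handled in the same set-based way.
    END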

5. Optimize your ASPDotNetStorefront SQL Server indexes

Finally, we took each of the key SQL SELECT statements used in the above processing and examined their execution plans in SQL Server Management Studio. This highlighted a number of areas where additional indexes could be created to increase performance – particularly in the SKU lookup step.
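
For example, an index along these lines (the index name is hypothetical, and this assumes the lookup hits a SKU column on dbo.Product) turns the batched SKU lookup into an index seek; check the execution plan before and after:

    -- Supports the set-based SKU lookup in the import procedure
    CREATE NONCLUSTERED INDEX IX_Product_SKU ON dbo.Product (SKU);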


If you would like a custom product importer written for your ASPDotNetStorefront site, then please get in touch. In addition to the module described above, other import/export modules Webopius have written include:

  • Excel-based product importer supporting multiple languages
  • XML product importer to enhance product listings with data from ICECAT.biz
  • Excel importer with custom field support such as Author, Account codes, Publisher etc.
  • XML export routine to automatically notify a warehouse provider of new orders and to update ASPDNSF shipping dates and tracking details
