Sorting has been a fundamental operation in data processing since the
days of punched cards.
Some of the more important applications of sorting are:
- ranking or ordering data
- arranging data so that it can be efficiently searched
- bringing together records with the same key
Nsort is the fastest sort program in the world.
Nsort can help any application that needs to sort large files of
data quickly.
Nsort is used today by customers in diverse applications, such as
an advertising management company that uses Nsort to analyze and
interpret gigabytes of web traffic per day.
If you have a time-critical sort application, Nsort can help improve
your application performance.
A large portion of the world's data is stored in databases or
data warehouses.
Nsort can help preprocess data for loading into a database or
data warehouse, so that the data is loaded as quickly as possible.
Nsort is currently being used by a well-known internet service provider to
prepare load data and compute aggregates for their data warehouse system.
Nsort can help you use your load window more efficiently.
Presorting your data in advance of the load window can reduce the
amount of work your database or data warehouse system needs to do
during the load window.
By letting Nsort presort your load data,
you can reduce the overall time that it takes to perform your data loads.
Nsort can also help make more efficient use of
your hardware. With Nsort,
you can get your data loaded more quickly without having to purchase
more processors or memory.
Presorting data before loading it into your database
or data warehouse drastically reduces the time required to create the primary
(main, master, or clustered) index for the table.
The fastest way to load a new table into a database or data warehouse is:
- Sort the data to be loaded by the key(s) of the table index.
- Load the data.
- Create the primary index with the "already sorted" option.
In the absence of the "already sorted" option, the database system must
sort the data in order to create the index.
This database sort usually takes longer than the actual data load.
Since Nsort sorts data very quickly (sort rates of 100 megabytes per second have been observed), presorting the load data with Nsort greatly reduces the overall time to load new data and create the primary index.
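
The sort step of this procedure can be illustrated with a minimal Python sketch. It is not Nsort's interface and it does not call any database; the names `presort`, `is_sorted`, and the sample fields `customer_id` and `order_id` are hypothetical. It only shows what "sorted by the key(s) of the table index" means for a load file, and the ordering check that an "already sorted" index build depends on.

```python
from operator import itemgetter


def presort(records, key_fields):
    """Return the records ordered by the index key fields (hypothetical helper)."""
    return sorted(records, key=itemgetter(*key_fields))


def is_sorted(records, key_fields):
    """Check that each record's key is >= the key of the record before it."""
    key = itemgetter(*key_fields)
    return all(key(a) <= key(b) for a, b in zip(records, records[1:]))


# Sample load data; the field names are assumptions made for illustration.
rows = [
    {"customer_id": 42, "order_id": 7},
    {"customer_id": 17, "order_id": 3},
    {"customer_id": 42, "order_id": 1},
]

index_key = ["customer_id", "order_id"]
load_ready = presort(rows, index_key)
```

Here `is_sorted(load_ready, index_key)` holds while `is_sorted(rows, index_key)` does not, which is the difference between a load that can skip the database sort and one that cannot.
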
In some cases presorting can also speed up the incremental loads of daily
or weekly additions to tables that already exist.
Some data warehouses allow part of a table to be offline or
disabled during the load.
If the newly loaded data is not already sorted, the data warehouse
must sort the new data in order to create the new portion of the table index.
Nsort's superior sort speed saves time by eliminating
the data warehouse's sort of the incremental data.
Incoming load data can contain bad records or data that is not germane to your application. Nsort can help by selecting just those records you want to load into your database. Nsort can perform range checking and make sure that required fields are present. Nsort can even provide default values for fields that are absent or out-of-range. Nsort can check for data integrity, isolating the records that fail integrity constraints into reject files. Nsort can also reformat records to add, remove, or rearrange fields.
With Nsort's help, you can streamline the load process by making sure that your RDBMS never sees bad or irrelevant records.
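
As a rough illustration of this kind of screening, the Python sketch below separates accepted records from rejects. It is not Nsort syntax; the function `screen`, the field names, the range limits, and the default value are all assumptions chosen for the example.

```python
def screen(records, low=0, high=10_000, default_region="UNKNOWN"):
    """Split records into (accepted, rejected) lists.

    Hypothetical rules: product_id is required, amount must lie in
    [low, high], and a missing region is replaced by a default value.
    """
    accepted, rejected = [], []
    for rec in records:
        # Required-field check: reject records without a product_id.
        if not rec.get("product_id"):
            rejected.append(rec)
            continue
        # Range check: reject a missing or out-of-range amount.
        amount = rec.get("amount")
        if amount is None or not (low <= amount <= high):
            rejected.append(rec)
            continue
        # Reformat: keep only the wanted fields and supply a default region.
        accepted.append({
            "product_id": rec["product_id"],
            "region": rec.get("region") or default_region,
            "amount": amount,
        })
    return accepted, rejected


incoming = [
    {"product_id": "P1", "amount": 250, "region": "west", "note": "ok"},
    {"product_id": "", "amount": 100, "region": "east"},   # missing required field
    {"product_id": "P2", "amount": -5, "region": "east"},  # amount out of range
    {"product_id": "P3", "amount": 900},                   # region absent
]
good, bad = screen(incoming)
```

In this sketch the `bad` list plays the role of a reject file, and only the records in `good` would be passed on to the database load.
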
Aggregated data forms the basis for most data warehouse systems. As Dr. Ralph Kimball stresses in The Data Warehouse Toolkit: "The use of prestored summaries (aggregates) is the single most effective tool the data warehouse designer has to control performance."
Prestored aggregates of a fact table can either be computed by the data warehouse, or computed externally and loaded into the data warehouse. Nsort's summarize feature allows aggregates to be computed quickly during the course of a sort, which shortens the overall process of generating prestored aggregates.
Nsort can also help you create fact table records that represent aggregated totals. For example, the fact table records might represent the total sales of a given product in a store on a particular day. Nsort can summarize records containing individual sales transactions to produce the total sales for each unique combination of product, store, and day.
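
The idea of summarizing during a sort can be sketched in Python as follows. This is an illustration of the technique only, not Nsort's summarize feature or its syntax; the function `summarize` and the field names (`product`, `store`, `day`, `amount_cents`) are hypothetical. Once transactions are sorted by the grouping key, records with the same key are adjacent and can be totaled in a single pass.

```python
from itertools import groupby
from operator import itemgetter


def summarize(transactions):
    """Total amount_cents for each unique (product, store, day) combination."""
    key = itemgetter("product", "store", "day")
    # Sorting brings records with equal keys together, so one pass can total them.
    ordered = sorted(transactions, key=key)
    return [
        {
            "product": product,
            "store": store,
            "day": day,
            "total_cents": sum(t["amount_cents"] for t in group),
        }
        for (product, store, day), group in groupby(ordered, key=key)
    ]


# Individual sales transactions (illustrative data; amounts in integer cents).
sales = [
    {"product": "widget", "store": "S1", "day": "2024-01-02", "amount_cents": 500},
    {"product": "gadget", "store": "S1", "day": "2024-01-02", "amount_cents": 1200},
    {"product": "widget", "store": "S1", "day": "2024-01-02", "amount_cents": 250},
    {"product": "widget", "store": "S2", "day": "2024-01-02", "amount_cents": 100},
]
fact_rows = summarize(sales)
```

Each entry in `fact_rows` corresponds to one aggregated fact table record, and the output is already ordered by the grouping key.
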