Firebird SQL Development
 
Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
User Name:
Password:
Remember me
Go Back   Dev Shed ForumsDatabasesFirebird SQL Development

Reply
Add This Thread To:
  Del.icio.us   Digg   Google   Spurl   Blink   Furl   Simpy   Y! MyWeb 
Thread Tools Search this Thread Rate Thread Display Modes
 
Unread Dev Shed Forums Sponsor:
  #1  
Old October 28th, 2004, 12:21 PM
HereSomeHow HereSomeHow is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Oct 2004
Location: Guadalajara, Jalisco, México
Posts: 62 HereSomeHow User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 17 h 51 m 29 sec
Reputation Power: 4
What to expect when Pumping Data?

Hi, I'm making my first attempt to develop an application that manages >1Gb of data.

I've allways worked with little databases in the MS-JET format (less than 400MB, around the 2 millions of records for one table), and now I've recieved a promotion to administer a new database (and produce apps to exploit it) that would have tables in the 15-20 millions of records.

My question is, I'm used to "update" the data in lapses of few minutes, using direct VBA programming, even inside Excel, and those times are not scaling accordingly with Firebird. I know this is a completely different plane of existance, so I'm asking the experts here.

What I'm trying todo is the following: I'm gonna manage the customers database for a bank, with tables describing the branches and their organization (900 branches, grouped in 48 zones, grouped in 9 regions) and a big table consisting in 5 fields: Key, Date, Product, Average and Remanent, to store a shot of the deposits of all the customers the bank has.

So this table is gonna contain 24 shots (2 years of history) with each shot approx. of 5 million records. To fill the table I'll get a CSV file of about 400Mb each week. So far I've tried with MS-Access 2003, linking the CSV file and the FB Table via ODBC, and surely it works, but takes 3 hours to feed the data, in a local server.

I'm using a P4-2.8Ghz 600~MB RAM SATA Disk, XPpro HT enabled. I know this is not the bes routet to take, but I need an orientation of how long should it take in a well developed process, so to know what to expect, and when to stop investing time trying to improve the process.

Thanks in advance.

Reply With Quote
  #3  
Old October 29th, 2004, 03:21 AM
fikret fikret is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Aug 2004
Location: Sarajevo, Bosnia
Posts: 111 fikret User rank is Corporal (100 - 500 Reputation Level)fikret User rank is Corporal (100 - 500 Reputation Level)fikret User rank is Corporal (100 - 500 Reputation Level)fikret User rank is Corporal (100 - 500 Reputation Level) 
Time spent in forums: 1 Day 5 h 17 m 43 sec
Reputation Power: 8
I have modified Microsoft Access and SQL Server to InterBase Wizard (http://www.ibphoenix.com/main.nfs?a...51&page=sql2gdb) so it's now able to process UNICODE data. Mail me if you need that. And, using this tool, I was able to pump 8 Gb of data from MSSQL2000 to FB1.5 in 4-5 hours.
Also, see InterBase DataPump, and I would recommend this tool over sql2gdb (it better handles BLOB's).


--
Best regards,
Fikret Hasovic http://fikret.fbtalk.net
TAMP R&D Team
FirebirdSQL Foundation member.
- Join today at http://www.firebirdsql.org/ff/foundation
JEDI VCS contributor
http://jedivcs.sourceforge.net/

Reply With Quote
  #4  
Old October 31st, 2004, 03:37 PM
skidder skidder is offline
Registered User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Aug 2003
Posts: 15 skidder User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Send a message via ICQ to skidder
With well-developed pump process Firebird can inject a few hundred MB of data per CPU per minute.

For example, you can convert your text data into fixed-width representation which may be directly attached to Firebird database using CREATE TABLE ... EXTERNAL ... mechanizm and insert data into normal table using INSERT statement or stored procedure.

If you have really fast IO subsystem and SMP box you can further improve performance via running serveral such INSERT statements in parallel on Firebird/Classic configuration. Try disabling forced writes for the pump process. Also, almost any UNIX tends to be much faster at bulk operations due to more intellegent filesystem cache and IO queue management.

Reply With Quote
  #5  
Old November 1st, 2004, 09:44 AM
HereSomeHow HereSomeHow is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Oct 2004
Location: Guadalajara, Jalisco, México
Posts: 62 HereSomeHow User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 17 h 51 m 29 sec
Reputation Power: 4
Thanks to all, I'll start trying the DataPump first, then the externally attached file. The only problem with it is that the file I get is a varible lenght one, sepparated with "|", soI'll have to massage it first to convert it to fixe lenght. But I'll try all the different approaches to see what fills the bill better.

Reply With Quote
  #6  
Old November 1st, 2004, 08:07 PM
HereSomeHow HereSomeHow is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Oct 2004
Location: Guadalajara, Jalisco, México
Posts: 62 HereSomeHow User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 17 h 51 m 29 sec
Reputation Power: 4
I'm feeling a little too dumb...

I already made a little app in Delphi to transform my delimited txt to a fixed lenght one. But now when I try to CREATE TABLE... EXTERNAL FILE, I get an SQL error saying that "Access to external file ... is denied by server administrator", the TXT is not opened by any other program, what I'm doing wrong? Thanks...

Reply With Quote
  #7  
Old November 1st, 2004, 11:09 PM
skidder skidder is offline
Registered User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Aug 2003
Posts: 15 skidder User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Send a message via ICQ to skidder
Go to firebird.conf and adjust ExternalFileAccess option appropriately.

Reply With Quote
  #8  
Old November 2nd, 2004, 04:44 PM
HereSomeHow HereSomeHow is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Oct 2004
Location: Guadalajara, Jalisco, México
Posts: 62 HereSomeHow User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 17 h 51 m 29 sec
Reputation Power: 4
Quote:
Originally Posted by skidder
Go to firebird.conf and adjust ExternalFileAccess option appropriately.

Great I have now completed a process to Import the text file into FireBird, using the external table. The full process takes about 35 minutes (down from 3 hours). Now I'm having another problem:

The file I'm translating has one field with 2 decimals, containig the average a client has in one time period. I made Delphi SUM the field while parsing the original file, and I get 79,690,952,644.34 in my process, now if I do a SELECT SUM... in the final table i get 79,690,969,711.864. I defined the field as DOUBLE PRECISION, am I mistaken? should I declare them DECIMAL(15,2) or something?

Reply With Quote
  #9  
Old October 5th, 2005, 07:34 AM
rik77 rik77 is offline
Registered User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Oct 2005
Posts: 2 rik77 User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 17 m 47 sec
Reputation Power: 0
Precision

Hey HereSomeHow,

The general rule with currency is that you calculate with exact numbers whenever you can. That means using "exact" data types, depending on your platform (sorry I haven't used Delphi or Firebird) they are usually called Decimal or Numeric.

These exact types can still yield approximate results if some operation results in a repeating decimal component which goes beyond the precision of your decimal variables. These imprecisions can build up if there is a sequence of operations that result in rounding. Using Double or Float is also good if the absolute size of the number is not too huge. If the absolute size is too massive, you will lose precision at the small scale of the number.

You should go over the operations that are conducted on your numbers and see where imprecisions might be introduced (diving by small numbers etc. ) to try to account for the discrepancy.

Reply With Quote
  #10  
Old October 5th, 2005, 07:40 AM
rik77 rik77 is offline
Registered User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Oct 2005
Posts: 2 rik77 User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 17 m 47 sec
Reputation Power: 0
Hey again, sorry I've had too many beers, I missed the obvious answer.

Yes if you can make the resultant field a (15,2) decimal that should be completely lossless with SUM operation over a set of (x<<15,2) decimals. Floats should also be OK in that circumstance, but with Decimal you know it's exact.

Reply With Quote
Reply

Viewing: Dev Shed ForumsDatabasesFirebird SQL Development > What to expect when Pumping Data?


Thread Tools  Search this Thread 
Search this Thread:

Advanced Search
Display Modes  Rate This Thread 
Rate This Thread:


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
View Your Warnings | New Posts | Latest News | Latest Threads | Shoutbox
Forum Jump


Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
  
 





© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 2 hosted by Hostway