PHP Development
 
Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
User Name:
Password:
Remember me
Go Back   Dev Shed ForumsProgramming LanguagesPHP Development

Reply
Add This Thread To:
  Del.icio.us   Digg   Google   Spurl   Blink   Furl   Simpy   Y! MyWeb 
Thread Tools Search this Thread Rate Thread Display Modes
 
Unread Dev Shed Forums Sponsor:
Stop making mediocre tutorials.The best tutorials are video! Camtasia Studio makes it easy to create engaging, buzz-building screen videos at any size, in any popular format. Download the free trial!
  #1  
Old May 7th, 2008, 10:58 AM
Nutopia Nutopia is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jun 2005
Posts: 84 Nutopia User rank is Private First Class (20 - 50 Reputation Level)Nutopia User rank is Private First Class (20 - 50 Reputation Level) 
Time spent in forums: 23 h 2 m 2 sec
Reputation Power: 3
Other - Issue with large files

Hi. I have sort of a complicated issue I've been trying to solve for the past week. Hoping someone might have an idea or feedback.

I've got two servers - a web server(apache) and a DB server (mysql). I need to create a very large csv file from the DB and make it available for download. The process will be executed via cron at night due to the amount of resources this will consume. The raw csv file can easily be over 70MB and grow larger.

So there are multiple ways to do this, each with their own advantages and disadvantages.

Option #1 is to perform a SELECT statement, pull the result set from mysql into PHP and use PHP to loop through the result set and create a CSV file. That works great for small files, but when we're talking about large ones I'm hitting php memory limits and script time outs. This could be due to a poorly written PHP loop - but my gut tells me that a file this large will have problems no matter what.
The advantage of this is that the file ends up on my web server where I can save it and make it available for download. Disadvantage is that PHP may not be able to handle the process without outrageously high memory limits and execution time limits.

Option #2 is to perform INTO OUTFILE and have mysql create the csv file. This then saves the file on my DB server. I need to get the file onto my web server. That would mean I'd have to FTP a >70MB file, and there could be a number of files (not just one.) I'd like to zip the file on the DB server first to save some transfer time, but I'm not sure if that's possible.
I've also run into some issues getting the transfer to work - I think they may have to do with file permissions but I'm not sure.

I know I haven't posted any code or examples yet, and I'll gladly do that if anyone out there wants to help out. If anyone thinks one option is better than another, or has a different idea, I'm all ears. If the whole thing is just too big, I'll have to just end up limiting the file sizes - but I'd like to avoid that if possible.

Thanks to anyone who may have some feedback!

This guy is pretty appropriate right now for me:
__________________
No trees were hurt in the sending of this message, however, a large number of electrons were terribly inconvenienced.

Reply With Quote
  #2  
Old May 7th, 2008, 11:05 AM
thedude2010 thedude2010 is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Apr 2005
Posts: 247 thedude2010 User rank is Sergeant Major (2000 - 5000 Reputation Level)thedude2010 User rank is Sergeant Major (2000 - 5000 Reputation Level)thedude2010 User rank is Sergeant Major (2000 - 5000 Reputation Level)thedude2010 User rank is Sergeant Major (2000 - 5000 Reputation Level)thedude2010 User rank is Sergeant Major (2000 - 5000 Reputation Level)thedude2010 User rank is Sergeant Major (2000 - 5000 Reputation Level) 
Time spent in forums: 3 Days 23 h 35 m 10 sec
Reputation Power: 33
you could have php run the select into outfile. Then have your cron tar the resulting file(s).

Quote:
Originally Posted by Nutopia
Hi. I have sort of a complicated issue I've been trying to solve for the past week. Hoping someone might have an idea or feedback.

I've got two servers - a web server(apache) and a DB server (mysql). I need to create a very large csv file from the DB and make it available for download. The process will be executed via cron at night due to the amount of resources this will consume. The raw csv file can easily be over 70MB and grow larger.

So there are multiple ways to do this, each with their own advantages and disadvantages.

Option #1 is to perform a SELECT statement, pull the result set from mysql into PHP and use PHP to loop through the result set and create a CSV file. That works great for small files, but when we're talking about large ones I'm hitting php memory limits and script time outs. This could be due to a poorly written PHP loop - but my gut tells me that a file this large will have problems no matter what.
The advantage of this is that the file ends up on my web server where I can save it and make it available for download. Disadvantage is that PHP may not be able to handle the process without outrageously high memory limits and execution time limits.

Option #2 is to perform INTO OUTFILE and have mysql create the csv file. This then saves the file on my DB server. I need to get the file onto my web server. That would mean I'd have to FTP a >70MB file, and there could be a number of files (not just one.) I'd like to zip the file on the DB server first to save some transfer time, but I'm not sure if that's possible.
I've also run into some issues getting the transfer to work - I think they may have to do with file permissions but I'm not sure.

I know I haven't posted any code or examples yet, and I'll gladly do that if anyone out there wants to help out. If anyone thinks one option is better than another, or has a different idea, I'm all ears. If the whole thing is just too big, I'll have to just end up limiting the file sizes - but I'd like to avoid that if possible.

Thanks to anyone who may have some feedback!

This guy is pretty appropriate right now for me:

Reply With Quote
  #3  
Old May 7th, 2008, 03:21 PM
requinix's Avatar
requinix requinix is offline
i am not a golden god :(
Click here for more information.
 
Join Date: Mar 2007
Location: Here
Posts: 2,080 requinix User rank is Major (30000 - 40000 Reputation Level)requinix User rank is Major (30000 - 40000 Reputation Level)requinix User rank is Major (30000 - 40000 Reputation Level)requinix User rank is Major (30000 - 40000 Reputation Level)requinix User rank is Major (30000 - 40000 Reputation Level)requinix User rank is Major (30000 - 40000 Reputation Level)requinix User rank is Major (30000 - 40000 Reputation Level)requinix User rank is Major (30000 - 40000 Reputation Level)requinix User rank is Major (30000 - 40000 Reputation Level)requinix User rank is Major (30000 - 40000 Reputation Level) 
Time spent in forums: 1 Month 2 Weeks 5 Days 3 h 30 m 55 sec
Reputation Power: 311
Send a message via MSN to requinix
It isn't PHP that runs the query, it's MySQL. If it runs a SELECT INTO OUTFILE then the file will be located where the MySQL server is, not where PHP is (since it seems there is a difference).

I'd just go for the first option. There shouldn't be much of a memory problem if you write data out to the file whenever possible.
But just in case,
PHP Code:
 set_time_limit(0);
ini_set("memory_limit","-1"); 
__________________
FUMOFFURU!
--
PHP Manual (cURL - PCRE regex - references) error_reporting() SQL injection: and mysql_real_escape_string() PHP and cron

Negative0 Have spare time? Interested in a large-scale freelance PHP project? Chance for a part-time gig? PM me for details.

Reply With Quote
  #4  
Old May 8th, 2008, 11:19 AM
Nutopia Nutopia is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jun 2005
Posts: 84 Nutopia User rank is Private First Class (20 - 50 Reputation Level)Nutopia User rank is Private First Class (20 - 50 Reputation Level) 
Time spent in forums: 23 h 2 m 2 sec
Reputation Power: 3
Quote:
Originally Posted by requinix
It isn't PHP that runs the query, it's MySQL. If it runs a SELECT INTO OUTFILE then the file will be located where the MySQL server is, not where PHP is (since it seems there is a difference).

Here's my problem - when I run MYSQL INTO OUTFILE it saves the csv on the DB server but the file is owned by the mysql user.
Then, when I try to use PHP to ftp_get the file, it's not letting me retrieve the file due to permission issue. Does anyone know how to work around this?


Quote:
Originally Posted by requinix
I'd just go for the first option. There shouldn't be much of a memory problem if you write data out to the file whenever possible.
But just in case,
PHP Code:
 set_time_limit(0);
ini_set("memory_limit","-1"); 


Ok - Maybe I've been doing this wrong. I'm looping through the RS and creating a string which is basically just formatted csv. Then, I write the file from that string variable:

Code:
$file_handle = fopen($filename,"x+");
$bytesWritten = fwrite($file_handle,$outputToFile,5242880);
fclose($file_handle);


Better way?

Last edited by Nutopia : May 8th, 2008 at 11:26 AM.

Reply With Quote
  #5  
Old May 8th, 2008, 11:27 AM
thedude2010 thedude2010 is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Apr 2005
Posts: 247 thedude2010 User rank is Sergeant Major (2000 - 5000 Reputation Level)thedude2010 User rank is Sergeant Major (2000 - 5000 Reputation Level)thedude2010 User rank is Sergeant Major (2000 - 5000 Reputation Level)thedude2010 User rank is Sergeant Major (2000 - 5000 Reputation Level)thedude2010 User rank is Sergeant Major (2000 - 5000 Reputation Level)thedude2010 User rank is Sergeant Major (2000 - 5000 Reputation Level) 
Time spent in forums: 3 Days 23 h 35 m 10 sec
Reputation Power: 33
it creates the file with the user that mysql is run under. So you should create a mysql user on the db with same username as your web server runs under with the appropriate permissions then run the select into outfile with that user.

Quote:
Originally Posted by Nutopia
Here's my problem - when I run MYSQL INTO OUTFILE it saves the csv on the DB server but the file is owned by the mysql user.
Then, when I try to use PHP to ftp_get the file, it's not letting me retrieve the file due to permission issue. Does anyone know how to work around this?

Reply With Quote
  #6  
Old May 8th, 2008, 12:09 PM
kicken's Avatar
kicken kicken is offline
Wiser? Not exactly.
Dev Shed Specialist (4000 - 4499 posts)
 
Join Date: May 2001
Location: Ft Myers, FL
Posts: 4,003 kicken User rank is Lieutenant Colonel (40000 - 50000 Reputation Level)kicken User rank is Lieutenant Colonel (40000 - 50000 Reputation Level)kicken User rank is Lieutenant Colonel (40000 - 50000 Reputation Level)kicken User rank is Lieutenant Colonel (40000 - 50000 Reputation Level)kicken User rank is Lieutenant Colonel (40000 - 50000 Reputation Level)kicken User rank is Lieutenant Colonel (40000 - 50000 Reputation Level)kicken User rank is Lieutenant Colonel (40000 - 50000 Reputation Level)kicken User rank is Lieutenant Colonel (40000 - 50000 Reputation Level)kicken User rank is Lieutenant Colonel (40000 - 50000 Reputation Level)kicken User rank is Lieutenant Colonel (40000 - 50000 Reputation Level)kicken User rank is Lieutenant Colonel (40000 - 50000 Reputation Level)  Folding Points: 115392 Folding Title: Super Ultimate Folder - Level 1Folding Points: 115392 Folding Title: Super Ultimate Folder - Level 1Folding Points: 115392 Folding Title: Super Ultimate Folder - Level 1Folding Points: 115392 Folding Title: Super Ultimate Folder - Level 1Folding Points: 115392 Folding Title: Super Ultimate Folder - Level 1Folding Points: 115392 Folding Title: Super Ultimate Folder - Level 1
Time spent in forums: 1 Month 2 Weeks 4 Days 16 h 44 m 28 sec
Reputation Power: 414
Send a message via ICQ to kicken Send a message via AIM to kicken Send a message via MSN to kicken
Quote:
Originally Posted by Nutopia
Ok - Maybe I've been doing this wrong. I'm looping through the RS and creating a string which is basically just formatted csv. Then, I write the file from that string variable:

Code:
$file_handle = fopen($filename,"x+");
$bytesWritten = fwrite($file_handle,$outputToFile,5242880);
fclose($file_handle);


Better way?


You need to write the file as you go through the records. Otherwise, your just creating a 70+MB string in memory, and obviouslly even if php had no memory limit, you'd hit the limit of your system's ram at some point.

Something like this is what you should do.

PHP Code:
 $fp fopen('out.csv''w');
while (
$row mysql_fetch_row($results)){
  
fwrite($fp'"'.implode('", "'$row).'"');
}
fclose($fp); 


With that method, you only ever have one row of data in memory at a time. The rest is being written to the hard drive out of the way. This way, your only limited to the amount of HD space you have.
__________________
Spidermonkey Tutorial

http://wiser.aoeex.com/ - Long term project (offline due to evil crawlers and lack of content)
http://www.aoeex.com/gmap.php - Put yourself on the map

Reply With Quote
  #7  
Old May 8th, 2008, 12:15 PM
Nutopia Nutopia is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jun 2005
Posts: 84 Nutopia User rank is Private First Class (20 - 50 Reputation Level)Nutopia User rank is Private First Class (20 - 50 Reputation Level) 
Time spent in forums: 23 h 2 m 2 sec
Reputation Power: 3
Thank you all - I understand what everyone is saying and I like it. I'll get to work on doing this. I feel myself becoming a better php programmer - slow and steady and with a little help from my 'friends.' I'll let you know how it turns out.

Reply With Quote
  #8  
Old May 8th, 2008, 02:29 PM
Nutopia Nutopia is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jun 2005
Posts: 84 Nutopia User rank is Private First Class (20 - 50 Reputation Level)Nutopia User rank is Private First Class (20 - 50 Reputation Level) 
Time spent in forums: 23 h 2 m 2 sec
Reputation Power: 3
My solution

Hey, here is the solution that I went with for now - seems to work pretty well! I did not use INTO OUTFILE at this point, but may turn to that in the future depending on how things are running - I wanted to avoid having to FTP the files back and forth. So I'm just taking my result set from mysql and using php to create the csv file:

I'm also using the keys of the result array to create column headers.

Code:
//Create CSV FILE		
$filename = $Name ."_yourExport_" . date("Y-m-d_G:i:s").".csv";
$pathToRoot = "..."
$fp = fopen($pathToRoot."/extracts/".$filename, 'w');
$headerAdded = false;
$header = "";		
foreach($leadsRS as $row){
	if(!$headerAdded) {
		foreach($row as $key => $value) {
		     if(array_key_exists($key,$columnsForHeader)) 
			$header .= $columnsForHeader[$key] . ",";
		     else
			$header .= $key . ",";
		}
	        fputcsv($fp,explode(",",$header));
		$headerAdded = true;
	}
	  fputcsv($fp,$row);
}
fclose($fp);


Thanks again for everyone's help.

Reply With Quote
Reply

Viewing: Dev Shed ForumsProgramming LanguagesPHP Development > Other - Issue with large files


Thread Tools  Search this Thread 
Search this Thread:

Advanced Search
Display Modes  Rate This Thread 
Rate This Thread:


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
View Your Warnings | New Posts | Latest News | Latest Threads | Shoutbox
Forum Jump

 Free IT White Papers!
 
Accelerating Trading Partner Performance
One in five. That's how many partner transactions have at least one error. That is an amazing statistic, particularly given the extraordinary leaps in innovation across the global supply chain during the past two decades. Download this white paper to learn more.

 
Competing on Analytics
This Tech Analysis is designed to help identify characteristics shared by analytics competitors, and includes information about 32 organizations that have made a commitment to quantitative, fact-based analysis.

 
Cost Effective Scaling with Virtualization and Coyote Point Systems
An overview of the industry trend toward virtualization, how server consolidation has increased the importance of application uptime and the steps being taken to integrate load balancing technology with virtualized servers.

 
Five Checkpoints to Implementing IP Telephony
Implementation planning for IP PBX software and IP telephony has become vital as businesses replace discontinued legacy PBX phone systems. This informative whitepaper outlines five "checkpoints" for any implementation plan that will help make IP communications a successful proposition.

 
Hosted Email Security: Staying Ahead of New Threats
In the last two years, email has become a fierce battleground between the nefarious forces of spam and malware, and the heroes of messaging protection. The spam volumes increased alarmingly every month, bringing clever new forms of phishing and virus propagation attacks.

 

Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
  
 





© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 4 hosted by Hostway