Discuss Data aggregation, scripting & MySQL in the PHP Development forum on Dev Shed.
PHP-DB - Data aggregation, scripting & MySQL
Hello PHP pros!
Pardon my ignorance, I'm an utter newb!
So I'm building an aggregator website and I wanted to ask some questions about the process, logic and workflow behind maintaining the data.
I have around 20 different products from different suppliers, in PDF or XLS format. Each supplier describes their product information in a different way, so I obviously can't extract product data using ONE mechanism. I need 20. Further, product data changes every once in a while, and I don't want to maintain the database by hand. I need a system that can be run to extract data from the PDF/XLS sources to reduce overhead.
So I have the following questions please:
What is the 'best' way to go about my problem: extracting data from raw sources, converting it into a standard form, inserting into MySQL, scripting the automation process, etc? Can you explain broadly the basic steps required to do what I need?
Should I look to convert the PDF/XLS files into HTML/CSV/Text files/some other format? What is the 'best' format to convert my source documents into assuming that my website is in PHP/MySQL and why?
Posts: 2,809
Time spent in forums: 3 Weeks 22 h 48 m 15 sec
Reputation Power: 3177
My first step is to design my database.
Although the database schema tends to evolve from its initial design, I try to keep two things in mind:
1 - How the data gets into the database
2 - How the data gets out of the database
For most web-based cases the data is read far more often than it is written, so if a compromise is ever needed, the design favours reading over writing.
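To make the "favour reading" idea concrete, here is a minimal sketch of what such a schema could look like. Everything in it is an assumption for illustration: the table and column names (`suppliers`, `products`, `sku`, `price_cents`) are hypothetical, and it uses Python's built-in `sqlite3` only so that it runs without a database server. The SQL would need small adjustments for MySQL, and the same shape ports to PHP.

```python
import sqlite3

# Hypothetical schema: one row per supplier, one row per product.
# The index on products.name exists purely to speed up reads (searches),
# at the cost of slightly slower writes.
SCHEMA = """
CREATE TABLE suppliers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE products (
    id          INTEGER PRIMARY KEY,
    supplier_id INTEGER NOT NULL REFERENCES suppliers(id),
    sku         TEXT NOT NULL,
    name        TEXT NOT NULL,
    price_cents INTEGER NOT NULL,
    UNIQUE (supplier_id, sku)
);
CREATE INDEX idx_products_name ON products(name);
"""


def create_database(path=":memory:"):
    """Open a database and create the tables and index."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def upsert_product(conn, supplier_id, sku, name, price_cents):
    """Insert a product, or replace it if this supplier already has that SKU.

    In MySQL the usual equivalent is INSERT ... ON DUPLICATE KEY UPDATE.
    """
    conn.execute(
        "INSERT OR REPLACE INTO products (supplier_id, sku, name, price_cents) "
        "VALUES (?, ?, ?, ?)",
        (supplier_id, sku, name, price_cents),
    )
    conn.commit()
```

The unique key on `(supplier_id, sku)` is what lets the import be re-run whenever a supplier publishes a new price list: existing products are overwritten rather than duplicated.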
Once you know what data you want to store, you can start building scripts to extract the relevant data from your sources.
Search around for scripts that will read your specific data types. I find PDFs a nightmare, and CSV is a lot easier than XLS.
You'll be looking at converting your documents to plain text, then extracting data for MySQL.
Your "common format" will be the data in your database.
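As a rough sketch of the steps above, one way to handle "20 suppliers, 20 layouts" is one small parser per supplier, each returning records in the same shape, plus a lookup table that picks the right parser. The supplier names (`acme`, `globex`), their column headings and the record keys are all made up for illustration, and the sketch assumes each source has already been converted to CSV text. It is written in Python so it can be run as-is; the same structure works in PHP.

```python
import csv
import io
from decimal import Decimal


def to_cents(amount):
    """Convert a decimal string such as '12.50' to integer cents (1250).

    Decimal avoids float rounding; any sub-cent fraction is truncated.
    """
    return int(Decimal(amount) * 100)


def parse_acme(text):
    """Hypothetical supplier: comma-separated, columns SKU, Product, Price."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"sku": row["SKU"], "name": row["Product"],
         "price_cents": to_cents(row["Price"])}
        for row in reader
    ]


def parse_globex(text):
    """Hypothetical supplier: semicolon-separated, decimal comma in prices."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [
        {"sku": row["Article"], "name": row["Description"],
         "price_cents": to_cents(row["Price"].replace(",", "."))}
        for row in reader
    ]


# One entry per supplier; adding a supplier means adding one function here.
PARSERS = {
    "acme": parse_acme,
    "globex": parse_globex,
}


def normalize(supplier, text):
    """Run the supplier-specific parser and return common-format records."""
    return PARSERS[supplier](text)
```

Every parser returns the same keys, so the code that writes to MySQL only has to be written once and never needs to know which supplier a record came from.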