PHP Development
 
  #1  
March 11th, 2010, 01:32 AM
marvinzzz
PHP-DB - Data aggregation, scripting & MySQL

Hello PHP pros!

Pardon my ignorance, I'm an utter newb!
So I'm building an aggregator website and I wanted to ask some questions about the process, logic and workflow behind maintaining the data.

I have around 20 different products from different suppliers that are in PDF or XLS format. Each supplier describes their product information in a different way, so I obviously can't extract product data using ONE mechanism; I need 20. Further, product data changes every once in a while, so I don't want to maintain the database manually. I need a system that can be run to extract data from the PDF/XLS sources to reduce overhead.

So I have the following questions please:
  • What is the 'best' way to go about my problem: extracting data from raw sources, converting it into a standard form, inserting into MySQL, scripting the automation process, etc? Can you explain broadly the basic steps required to do what I need?
  • Should I look to convert the PDF/XLS files into HTML/CSV/Text files/some other format? What is the 'best' format to convert my source documents into assuming that my website is in PHP/MySQL and why?

Any ideas/suggestions? Thanks!

  #2  
March 11th, 2010, 03:14 AM
Northie
My first step is to design my database

Although the database schema tends to evolve from its initial design, I try to keep two things in mind:

1 - How the data gets into the database
2 - How the data gets out of the database

For most web-based cases the data is read out far more often than it is written in, so the design favours reading rather than writing whenever a compromise is needed.
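As a rough illustration of a read-favouring design, here is a minimal sketch of one possible table. The table name `products`, every column name, and the helper `schemaStatements()` are assumptions made for illustration; nothing in the thread specifies them. The unique key gives each supplier's product a single row, and the extra index on `name` costs a little on every write in exchange for faster lookups.

```php
<?php
// Hypothetical schema sketch: all names here are illustrative assumptions.
function schemaStatements(): array
{
    return [
        'CREATE TABLE IF NOT EXISTS products (
            id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
            supplier VARCHAR(64) NOT NULL,
            sku VARCHAR(64) NOT NULL,
            name VARCHAR(255) NOT NULL,
            price DECIMAL(10,2) NOT NULL,
            updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
                ON UPDATE CURRENT_TIMESTAMP,
            -- one row per supplier product, so re-imports can update in place
            UNIQUE KEY uniq_supplier_sku (supplier, sku),
            -- read-side index: slows writes slightly, speeds up name lookups
            KEY idx_name (name(100))
        ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4',
    ];
}
```

Each statement could be passed to a database connection once at set-up time; the function itself only returns the SQL text.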

Once you know what data you want to store, you can start building scripts to extract the relevant data from your sources.

Search around for scripts that will read your specific data types. I find PDFs a nightmare, and CSV is a lot easier than XLS.
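To show why CSV is the easier route, here is a minimal sketch of one way to handle "20 suppliers, 20 layouts": keep a small column map per supplier and run every file through the same function. The supplier names, header names, the `SUPPLIER_MAPS` constant and the `normaliseCsv()` function are all hypothetical, and the sketch assumes each spreadsheet has already been exported to CSV with a header row.

```php
<?php
// Hypothetical per-supplier maps: source column header => common field name.
const SUPPLIER_MAPS = [
    'acme'   => ['Item No' => 'sku', 'Description' => 'name', 'Unit Price' => 'price'],
    'globex' => ['code' => 'sku', 'title' => 'name', 'cost_gbp' => 'price'],
];

// Turn one supplier's CSV text into rows that share the same field names.
// Note: splitting on line breaks does not support quoted fields that
// themselves contain newlines.
function normaliseCsv(string $csv, array $map): array
{
    $lines  = preg_split('/\r\n|\n|\r/', trim($csv));
    $header = str_getcsv(array_shift($lines), ',', '"', '\\');
    $rows   = [];

    foreach ($lines as $line) {
        if (trim($line) === '') {
            continue; // ignore blank lines
        }
        $cells = str_getcsv($line, ',', '"', '\\');
        if (count($cells) !== count($header)) {
            continue; // skip malformed rows rather than guess
        }
        $source = array_combine($header, $cells);

        $row = [];
        foreach ($map as $from => $to) {
            $row[$to] = trim($source[$from] ?? '');
        }
        $row['price'] = (float) ($row['price'] ?? 0);
        $rows[] = $row;
    }

    return $rows;
}
```

Adding a supplier then means adding one map entry rather than writing a new script, at least for suppliers whose files are simple tables.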

You'll be looking at converting your documents to plain text, then extracting the data for MySQL.
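Once a document is plain text, extraction usually comes down to pattern matching. The sketch below assumes the text has already been produced by some external converter and that the supplier labels its fields as "Product Code:" and "Price:"; both labels and the function name `extractProduct()` are hypothetical, and a real supplier layout would need its own patterns.

```php
<?php
// Pull a product code and price out of plain text.
// The "Product Code:" and "Price:" labels are assumed, not taken from any real file.
function extractProduct(string $text): ?array
{
    if (!preg_match('/^Product Code:\s*(\S+)/mi', $text, $sku)) {
        return null; // label not found: let the caller decide what to do
    }
    // Skip any currency symbol, then capture digits with optional commas and decimals.
    if (!preg_match('/^Price:\s*[^\d\n]*([\d,]+(?:\.\d+)?)/mi', $text, $price)) {
        return null;
    }

    return [
        'sku'   => $sku[1],
        'price' => (float) str_replace(',', '', $price[1]),
    ];
}
```

Returning `null` on a failed match makes it easy to log which source files no longer fit their pattern when a supplier changes layout.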

Your "common format" will be the data in your database.
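Since product data changes now and then, the load step has to cope with rows that already exist. A minimal sketch, assuming a hypothetical `products` table with columns `supplier`, `sku`, `name` and `price` and a unique key on (`supplier`, `sku`), uses PDO prepared statements with MySQL's `INSERT ... ON DUPLICATE KEY UPDATE`. The constant and function names are illustrative only.

```php
<?php
// Assumed table: products(supplier, sku, name, price), unique on (supplier, sku).
const UPSERT_SQL =
    'INSERT INTO products (supplier, sku, name, price)
     VALUES (:supplier, :sku, :name, :price)
     ON DUPLICATE KEY UPDATE name = VALUES(name), price = VALUES(price)';

// Build the bound parameters for one normalised row.
function upsertParams(string $supplier, array $row): array
{
    return [
        ':supplier' => $supplier,
        ':sku'      => $row['sku'],
        ':name'     => $row['name'],
        ':price'    => $row['price'],
    ];
}

// Insert or update every row in one transaction; returns the number of rows sent.
function saveProducts(PDO $pdo, string $supplier, array $rows): int
{
    $stmt = $pdo->prepare(UPSERT_SQL);
    $pdo->beginTransaction();
    try {
        foreach ($rows as $row) {
            $stmt->execute(upsertParams($supplier, $row));
        }
        $pdo->commit();
    } catch (Throwable $e) {
        $pdo->rollBack(); // leave the table unchanged if any row fails
        throw $e;
    }

    return count($rows);
}
```

A script like this can be re-run on a schedule (for example from cron) each time a supplier sends a new file, because repeated runs update existing rows instead of duplicating them.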
