C Programming
 
  #1  
Old February 18th, 2003, 05:15 PM
marron79
Rut row Raggy!
Dev Shed Novice (500 - 999 posts)

Join Date: Jul 2001
Location: Tornado Alley
Posts: 558
Time spent in forums: 9 h 41 sec
Reputation Power: 8
Obtaining html files and saving them as text in C or C++

I got Petzold's book today from Amazon.com, and while it's a great book, it spends little time on the internet (even though it spends hundreds of pages on bitmaps), which I thought was odd since the book was printed in '98. Anyway, the program I want to develop needs to open HTML files from a website, edit them (saving only the parts needed), and save them as hidden .dat files. Or I could just read the HTML files. How would I accomplish this in C or C++? Would I use the normal file I/O functions and which is better (saving the file or reading it)?
__________________
Matt

  #2  
Old February 18th, 2003, 09:10 PM
balance
.
Dev Shed Newbie (0 - 499 posts)

Join Date: Dec 2002
Posts: 296
Time spent in forums: < 1 sec
Reputation Power: 6
The code linked to in the 'a very small http server in c' thread might be helpful. I think fopen will help.

Last edited by balance : February 18th, 2003 at 09:19 PM.

  #3  
Old February 18th, 2003, 10:35 PM
marron79
That's for Unix, I'm making an app for Win32.

  #4  
Old February 18th, 2003, 11:51 PM
dwise1_aol
Contributing User
Dev Shed Expert (3500 - 3999 posts)

Join Date: Jan 2003
Location: USA
Posts: 3,803
Time spent in forums: 1 Month 11 h 57 m 9 sec
Reputation Power: 437
Quote:
Originally posted by marron79
That's for Unix, I'm making an app for Win32.


Winsock also supports the standard sockets API for the most part. Read "Transitioning from UNIX to Windows Socket Programming" by Paul O'Steen at http://cs.baylor.edu/~donahoo/pract...dowsSockets.pdf for instructions on converting a UNIX sockets program to a Win32 Winsock console application. You can even do multithreading in both UNIX and Win32, though the function names are a bit different. About the only thing you can't do in Win32 is process forking.
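
To illustrate how small the UNIX/Winsock differences are, here is a minimal sketch of one source file that builds against either API. It is not taken from the O'Steen paper; `build_get_request`, `http_get`, `sock_t`, `BAD_SOCK` and `close_sock` are hypothetical names made up for this example, and it assumes a C99 compiler (for `snprintf`) plus `getaddrinfo` support. The Winsock-only parts are the headers, `WSAStartup`/`WSACleanup`, and `closesocket` in place of `close`.

```c
/* Sketch: the same HTTP client source for Winsock and BSD sockets. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* exposes getaddrinfo on POSIX systems */
#endif
#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#  include <winsock2.h>
#  include <ws2tcpip.h>           /* link with ws2_32.lib */
typedef SOCKET sock_t;
#  define BAD_SOCK INVALID_SOCKET
#  define close_sock closesocket
#else
#  include <sys/types.h>
#  include <sys/socket.h>
#  include <netdb.h>
#  include <unistd.h>
typedef int sock_t;
#  define BAD_SOCK (-1)
#  define close_sock close
#endif

/* Format an HTTP/1.0 GET request; returns its length, or -1 if buf is too small. */
int build_get_request(char *buf, size_t size, const char *host, const char *path)
{
    int n = snprintf(buf, size,
                     "GET %s HTTP/1.0\r\nHost: %s\r\nConnection: close\r\n\r\n",
                     path, host);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Fetch http://host/path and write the raw response (headers included) to out.
   Returns 0 on success, -1 on failure. */
int http_get(const char *host, const char *path, FILE *out)
{
    char buf[4096];
    struct addrinfo hints, *res, *p;
    sock_t s = BAD_SOCK;
    int n, len, rc = -1;
#ifdef _WIN32
    WSADATA wsa;                  /* Winsock must be started before any socket call */
    if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0) return -1;
#endif
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, "80", &hints, &res) == 0) {
        /* Try each resolved address until one connects. */
        for (p = res; p != NULL; p = p->ai_next) {
            s = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
            if (s == BAD_SOCK) continue;
            if (connect(s, p->ai_addr, (int)p->ai_addrlen) == 0) break;
            close_sock(s);
            s = BAD_SOCK;
        }
        freeaddrinfo(res);
    }
    if (s != BAD_SOCK) {
        len = build_get_request(buf, sizeof buf, host, path);
        if (len > 0 && send(s, buf, len, 0) == len) {
            /* recv returns 0 once the server closes the connection. */
            while ((n = (int)recv(s, buf, (int)sizeof buf, 0)) > 0)
                fwrite(buf, 1, (size_t)n, out);
            rc = (n == 0) ? 0 : -1;
        }
        close_sock(s);
    }
#ifdef _WIN32
    WSACleanup();
#endif
    return rc;
}
```

Calling something like `http_get("example.com", "/index.html", fp)` with a file opened by `fopen(..., "wb")` would save the whole response, HTTP headers included, so the caller still has to skip past the first blank line to reach the HTML.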

But it sounded more like you were talking about parsing HTML. If that's the case, I do believe that there's an MFC view class for HTML.

  #5  
Old February 19th, 2003, 02:57 AM
marron79
Quote:
Originally posted by dwise1_aol
But it sounded more like you were talking about parsing HTML. If that's the case, I do believe that there's an MFC view class for HTML.


Yes, I've seen the MFC thing for HTML views, but I'm using Win32. Is there one for Win32?

  #6  
Old February 19th, 2003, 10:03 AM
dwise1_aol
Quote:
Originally posted by marron79
Yes, I've seen the MFC thing for HTML views, but I'm using Win32. Is there one for Win32?


The source for the implementation of the CHtmlView class is in the VIEWHTML.CPP file in the C:\Program Files\Microsoft Visual Studio\VC98\MFC\SRC directory (actual path may vary). I haven't looked at it closely yet, but it no doubt depends on the rest of MFC to work. Still, you might find some information or ideas you could use.

Searching on Google for -- Win32 HTML parser C++ -- produced some likely hits. One of them linked me to Odin Consulting's OPP (Open Plus Plus) page at http://www.odin-consulting.com/OPP/ . Their OPP library contains an HTML parser that:
"can interpret HTML as a human reader can, understanding tables, fonts and so on. It can also "fix" broken HTML. This is a proof of concept implementation, with better and more compliant versions to come."
The library comes in a "tar ball" that WinZIP can easily handle.
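
If only the visible text is needed, a full parser may be more than the job requires. The following is a rough sketch of that idea and is not OPP's parser or its API: `strip_tags` and `save_text` are hypothetical helper names, and the approach assumes well-behaved markup. It simply drops everything between `<` and `>`, so it does not handle comments, scripts, or character entities such as `&amp;`.

```c
#include <stdio.h>
#include <string.h>

/* Copy html into text, skipping everything between '<' and '>'.
   text must have room for strlen(html) + 1 bytes.
   Returns the length of the resulting text. */
size_t strip_tags(const char *html, char *text)
{
    size_t n = 0;
    int in_tag = 0;

    for (; *html != '\0'; html++) {
        if (*html == '<')
            in_tag = 1;                 /* a tag starts: stop copying */
        else if (*html == '>' && in_tag)
            in_tag = 0;                 /* the tag ends: resume copying */
        else if (!in_tag)
            text[n++] = *html;          /* ordinary character outside a tag */
    }
    text[n] = '\0';
    return n;
}

/* Write text to path with the normal file I/O functions.
   Returns 0 on success, -1 on failure. */
int save_text(const char *path, const char *text)
{
    FILE *out = fopen(path, "w");
    if (out == NULL) return -1;
    if (fputs(text, out) == EOF) {
        fclose(out);
        return -1;
    }
    return fclose(out) == 0 ? 0 : -1;
}
```

For example, passing `"<p>Hello, <b>world</b>!</p>"` to `strip_tags` leaves `"Hello, world!"` in the output buffer, which `save_text` can then write to a .dat file.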

Also, an observation about Petzold. He's been writing pretty much the same book since Windows v2 and possibly even before. When a new version of Windows would come out, he'd update the book to cover the new features. I even have a copy that was rewritten for OS/2's Presentation Manager (very similar to the Windows SDK). Since C and the SDK (software development kit, AKA "Windows API") were the only way to write Windows programs when the first book was written, that is the approach his books still offer -- at least up to Windows 95, which is the most recent of his books that I have. That is also why he doesn't cover the Internet and network programming -- in Win16, sockets programming was somewhat cumbersome and required a lot of message processing. Still, it's a good book that does contain a lot of good information about Windows API programming.

  #7  
Old February 19th, 2003, 01:29 PM
Scorpions4ever
Banned ;)
Dev Shed God 5th Plane (7000 - 7499 posts)

Join Date: Nov 2001
Location: Glendale, Los Angeles County, California, USA
Posts: 7,442
Time spent in forums: 1 Month 2 h 5 m 45 sec
Reputation Power: 797
As for getting an HTTP page in Windows, you can also use the WinINET functions http://msdn.microsoft.com/library/d...t_functions.asp

In particular, you would be looking at InternetOpenUrl() and InternetReadFile()/InternetReadFileEx().

These functions require that IE be installed (which it is for practically any windoze installation).
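
A minimal sketch of how those calls might fit together in plain Win32 C, with no MFC. `fetch_url_to_file` and the `"fetch-sketch"` agent string are hypothetical names chosen for this example, and error handling is kept to the bare minimum. Since the original question asked for hidden files, the sketch also sets the hidden attribute with `SetFileAttributesA`. On anything other than Windows the function just reports failure.

```c
#include <stdio.h>

#ifdef _WIN32
#  include <windows.h>
#  include <wininet.h>            /* link with wininet.lib */
#endif

/* Download url and write the bytes to path, then mark the file hidden.
   Returns 0 on success, -1 on failure. */
int fetch_url_to_file(const char *url, const char *path)
{
    if (url == NULL || path == NULL) return -1;
#ifdef _WIN32
    {
        HINTERNET session, request;
        char buf[4096];
        DWORD got = 0;
        BOOL ok = FALSE;
        FILE *out;
        int rc = -1;

        session = InternetOpenA("fetch-sketch", INTERNET_OPEN_TYPE_PRECONFIG,
                                NULL, NULL, 0);
        if (session == NULL) return -1;

        request = InternetOpenUrlA(session, url, NULL, 0,
                                   INTERNET_FLAG_RELOAD, 0);
        if (request != NULL) {
            /* An existing hidden file cannot be overwritten by fopen,
               so clear the attribute first (this fails harmlessly if
               the file does not exist yet). */
            SetFileAttributesA(path, FILE_ATTRIBUTE_NORMAL);
            out = fopen(path, "wb");
            if (out != NULL) {
                /* End of data is reported as success with 0 bytes read. */
                while ((ok = InternetReadFile(request, buf, sizeof buf, &got)) != FALSE
                       && got > 0)
                    fwrite(buf, 1, got, out);
                if (ok) rc = 0;
                fclose(out);
                if (rc == 0)
                    SetFileAttributesA(path, FILE_ATTRIBUTE_HIDDEN);
            }
            InternetCloseHandle(request);
        }
        InternetCloseHandle(session);
        return rc;
    }
#else
    return -1;                    /* WinINet is only available on Windows */
#endif
}
```

Unlike a raw socket read, `InternetReadFile` hands back only the page body, so the saved file should contain the HTML without the HTTP headers.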

  #8  
Old February 19th, 2003, 03:05 PM
marron79
Quote:
Originally posted by Scorpions4ever
As for getting an HTTP page in Windows, you can also use the WinINET functions http://msdn.microsoft.com/library/d...t_functions.asp

In particular, you would be looking at InternetOpenUrl() and InternetReadFile()/InternetReadFileEx().

These functions require that IE be installed (which it is for practically any windoze installation).


Sounds like what I'm looking for! Thanks for your replies.

© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 2 hosted by Hostway