There are utilities for this sort of thing, but if you want a DIY experience to force yourself to learn a few tricks, I would recommend focusing on skills that are useful in as many settings as possible.
If I were to design something like this I'd ask a few questions first:
- What sort of data do we need to see?
- What sort of data do we want to see?
- What is the smallest feature set which, when completed, defines project success?
- Does the data need to be retrieved when the page is looked at, or can it be cached and updated on a schedule?
- Does the data need to be historical, or is the current/latest update all we care about?
- Why are we using the web? Might we ever want to use another interface? (e.g. an Android application)
- Might we ever want to add a new tracking criterion after the initial deployment?
If I were to build a prototype I'd start with how to get the data first and then where to keep it. A scripting language is an ideal way to query system data (unless you want to write your own top-like program -- then C is pretty straightforward), and languages like Python and Perl have networking, database, GUI, and Unix shell bindings that are easy to use.
To keep things simple I'd write a scheduled reporter based on a system cron job (receiving commands for dynamic output requires either writing a daemon or maintaining an open connection -- way too complex for this stage).
I'd write a script that, say, dumps the output of "df" to a text file somewhere. Then on the server I'd get Postgres (an awesome, free RDBMS server) running and create a very simple schema to track the sending system, its IP, the incoming timestamp, the reported timestamp, the command string, and a name for the output type.
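The client-side dump script really is that small. A minimal sketch (the `capture_command`/`dump_to_file` helpers and the file naming scheme are my own inventions, not anything standard):

```python
#!/usr/bin/env python3
"""Dump the output of a system command (e.g. df) to a timestamped text file.

Sketch only -- the output directory and filename layout are placeholders.
"""
import socket
import subprocess
from datetime import datetime, timezone
from pathlib import Path

def capture_command(cmd):
    """Run cmd and return (hostname, ISO timestamp, output text)."""
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    stamp = datetime.now(timezone.utc).isoformat()
    return socket.gethostname(), stamp, out

def dump_to_file(cmd, outdir):
    """Write one command's output, with a header line, to a text file."""
    host, stamp, out = capture_command(cmd)
    path = Path(outdir) / f"{host}-{cmd[0]}.txt"
    path.write_text(f"# {host} {stamp} {' '.join(cmd)}\n{out}")
    return path

if __name__ == "__main__":
    print(dump_to_file(["df", "-P"], "/tmp"))
```

The header line carries the host, timestamp, and command string so the server-side insert can fill its columns without guessing.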
Now I've got something to send and some place to receive it. Then I'd try to manually send the content via psql (a text Postgres client) from the client to the server. Then I'd write another script that either connects directly (straight from Python using the psycopg2 library is easy) or via psql (most straightforward option if the script is in Bash or some language that lacks Postgres bindings) to the server and see if I can get the insert to the server DB to work.
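The psycopg2 version of that insert is only a few lines. A sketch, assuming a hypothetical `reports` table matching the schema above -- the table, columns, and connection string are placeholders for whatever you actually create:

```python
"""Insert one captured report into Postgres via psycopg2.

Sketch only: table/column names are invented here, and the conninfo string
is a placeholder for your server's actual settings.
"""

INSERT_SQL = """
    INSERT INTO reports (system, ip, received_at, reported_at, command, output_type, payload)
    VALUES (%s, %s, now(), %s, %s, %s, %s)
"""

def send_report(conninfo, system, ip, reported_at, command, output_type, payload):
    """Connect, insert one row, commit (the context managers handle commit/close)."""
    import psycopg2  # third-party: pip install psycopg2-binary
    with psycopg2.connect(conninfo) as conn:
        with conn.cursor() as cur:
            cur.execute(INSERT_SQL, (system, ip, reported_at, command,
                                     output_type, payload))

# Against a real server, something like:
# send_report("dbname=sysmon user=reporter host=dbhost",
#             "web01", "10.0.0.5", "2014-01-01T00:00:00+00:00",
#             "df -P", "df", open("/tmp/web01-df.txt").read())
```

Parameterized `%s` placeholders (rather than string formatting) are what keep command output with quotes and backslashes from breaking the insert.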
If it does, then I'd write another script that calls both of the ones just written in order -- dump the data to a text file, then call the Postgres server and insert it as table data. If that works, then I'd write a cron job to call the script just written every X minutes/hours/whatever and check to make sure it's actually doing its job without problems.
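The cron side is a one-line crontab entry; the script path and interval here are placeholders for your own:

```shell
# m  h  dom mon dow  command -- every 15 minutes, log output for debugging
*/15 *  *   *   *    /usr/local/bin/report.py >> /var/log/report.log 2>&1
```

Redirecting stdout/stderr to a log file is the cheap way to do that "check it's doing its job" step.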
If all that works then I'd rework the set of three scripts a bit. The final script that controls everything would grow a "check settings" type function that reads through a list of scripts to run, or a set of commands to run (whatever). When called it would now check its settings and run through the list of data pullers without me needing to write a new job for it every time. Then I'd think of a way to check whether a file we've got on the client side has been sent yet or not (this is networking -- things will go down), think about how the program should behave when it can't connect to the server, and if I were really interested I'd probably write a way for it to become aware of how long it's been since it was last run, so I can report gap times to the server.
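That config-driven runner plus a "not sent yet" spool might look something like this sketch -- the `commands.conf` format (one command per line) and the spool-directory convention are conventions I'm making up here:

```python
"""Config-driven collector: read a list of commands, run each, spool output.

Sketch with invented conventions: one command per line in the config file,
and a spool directory holding files not yet delivered to the server.
"""
import shlex
import socket
import subprocess
from datetime import datetime, timezone
from pathlib import Path

def run_configured(conf_path, spool_dir):
    """Run every command listed in conf_path; one spool file per command."""
    spool = Path(spool_dir)
    spool.mkdir(parents=True, exist_ok=True)
    host = socket.gethostname()
    written = []
    for line in Path(conf_path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        cmd = shlex.split(line)
        out = subprocess.run(cmd, capture_output=True, text=True).stdout
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")
        path = spool / f"{host}-{cmd[0]}-{stamp}.txt"
        path.write_text(out)
        written.append(path)
    return written

def pending(spool_dir):
    """Files still waiting to be sent; delete or move each after a good insert."""
    return sorted(Path(spool_dir).glob("*.txt"))
```

The spool directory doubles as the "can't reach the server" behavior: when the insert fails, the file just stays put and gets retried on the next run.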
Then it's back to the DB to expand the schema to account for the new data. Also, our toy schema is no longer good enough and we need real authentication now -- that means creating roles for each system that's sending, and a way for those systems to authenticate to the DB so you're not just letting anyone read from or write to your server (using Postgres's built-in user/role model massively simplifies this, btw).
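The per-system role setup is just a couple of SQL statements you'd run as the DB superuser (via psql). A sketch that generates them -- role and table names are placeholders, and you'd still pair this with pg_hba.conf entries requiring password auth:

```python
"""Generate per-client Postgres role SQL so each sender authenticates as itself.

Sketch only: names are placeholders; run the output as the DB superuser.
"""

def grant_sql(role, password, table="reports"):
    """SQL to create a login role that may only append to the reports table."""
    return (
        f"CREATE ROLE {role} LOGIN PASSWORD '{password}';\n"
        f"GRANT INSERT ON {table} TO {role};\n"
    )

# One role per reporting host, e.g.:
# print(grant_sql("web01_reporter", "s3cret"))
```

Insert-only grants mean a compromised client can spam your table but can't read or delete anyone else's data, which is most of the point.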
There remain a million tweaks on the client side I'd write into a TODO/Wishlist and forget about for the time being. The point, up to here, would be to see whether you can create a system data reporting infrastructure based on easy-to-handle, broadly useful, readily available components that don't cost you anything (so far that's *nix, cron, Python/Bash/Perl/whatever, and Postgres). Nothing up to now is very hard, but all of it is very useful to know.
The next step would be to get the data output somewhere useful. Since we're just doing static data dumps, I'd probably write a script that builds a static HTML page from the data in the DB whenever it's asked to, for now. You can get all crazy with web frameworks and things later -- frameworks like Django are so easy this should really be an afterthought. Focus first on the low-order task of making sure your webserver can actually serve pages and that you understand where old-fashioned HTML document files get stored. Then script the construction of them.
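The static-page builder is a few lines once the rows are in hand. A sketch -- the row shape here is whatever your SELECT returns, and the output path is wherever your webserver serves documents:

```python
"""Build a static HTML page from report rows.

Sketch: rows are tuples as they might come back from a SELECT; in real use
you'd fetch them with psycopg2 and write the result under your webserver's
document root (e.g. /var/www/html).
"""
import html

def render_page(rows, title="System reports"):
    """Return a complete HTML document with one table row per report row."""
    cells = "\n".join(
        "<tr>" + "".join(f"<td>{html.escape(str(c))}</td>" for c in row) + "</tr>"
        for row in rows
    )
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{html.escape(title)}</title></head>\n"
        "<body><table>\n"
        "<tr><th>System</th><th>Reported</th><th>Type</th></tr>\n"
        f"{cells}\n"
        "</table></body></html>\n"
    )

# In the real script: fetch rows from Postgres, then
# open("/var/www/html/reports.html", "w").write(render_page(rows))
```

`html.escape` matters even here -- command output can contain angle brackets, and you don't want your df dump injecting markup into your status page.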
Once I was satisfied I'd probably do the "build a page on request from the DB" thing. I'm intimately familiar with Django, so I'd use that, but pick your poison. Be warned: all frameworks suck for non-trivial data. So the key here is to keep yours trivial and stupid, just like everything else on the web.
At this point I might check my TODO/wishlist file, or not. Depends on how much time I had available and how interested I still was in the project.
The very last thing I'd do would be to tackle the task of making a live request from a web page generate refresh responses from the client computers. A lot more goes into that, so it needs to be last. That's also where you will stand a very high chance of opening gigantic security holes in your network without realizing it -- so once again, last.
If you look back over my enormous, un-edited brain dump you'll see that while the details are focused on the task you specified, the underlying process of "break this bite-sized chunk off and explore it, then this one, etc." as well as the idea of focusing on using broadly useful, available, accessible things up front (languages, kernels, DB servers, etc.) and then niche-use stuff last (a web framework -- none of which may be popular next year) is the way quite a few one-man FOSS projects go. Well, if the guy has built a thing or two already, that is.
Blah. Movie time.