    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    10
    Rep Power
    0

    I need a programmer's insight into my dissertation topic


    Hello,

    I have a number of questions to ask any programmer that knows specifically about encryption/databases/web-development for my chosen dissertation.

    Now, I do not want anyone to do any of the work for me. I just need pushing in the right direction on a few queries I have; my questions below will make that clearer. Thank you in advance to anyone willing to answer them. I welcome all opinions. You guys are the experts and I'm attempting to become one.

    The dissertation topic I have chosen is to design and implement a SECURE cloud storage web application (my own personal Dropbox).

    Things I want answers to:

    • What programming language(s) will I need to create the application? (in your opinion)


    • I want to upload files and then encrypt them.. What is the best way to learn these techniques and why?


    • Links or ebooks that go into details of how to code security into web applications. mysql injection prevention, etc etc?


    • Anything else I need to consider to take on the project?


    I am very open-minded about what I am planning to do, so all advice is invaluable to me. I only have a moderate level of programming experience in a few languages, hence the stupidly obvious questions. Apologies.

    Any help will be greatly appreciated.

    Thanks, Joey
  2. #2
  3. Banned ;)
    Devshed Supreme Being (6500+ posts)

    Join Date
    Nov 2001
    Location
    Woodland Hills, Los Angeles County, California, USA
    Posts
    9,638
    Rep Power
    4247
    Originally Posted by JoeyG1717
    • What programming language(s) will I need to create the application? (in your opinion)
    Any reasonable programming language should do the trick. I assume that high performance is not a requirement for you. Just pick the language you're most familiar with and see if it will do the job for you. Personally, I would use python or C++. Why? Large # of useful libraries + they are popular languages with widespread usage, so I can ask forums for help if I get stuck with something.

    Originally Posted by JoeyG1717
    • I want to upload files and then encrypt them.. What is the best way to learn these techniques and why?
    You're doing it in the wrong order. You should encrypt the file first and then upload it. Otherwise, someone with a packet sniffer can see the contents of your file as you're uploading it.

    Originally Posted by JoeyG1717
    • Links or ebooks that go into details of how to code security into web applications. mysql injection prevention, etc etc?
    This depends on what programming language you are going to use. Some language frameworks do a better job of security than others IMHO, but security is a process, not a program. Any amount of secure programming isn't going to help if your protocol is weak in the first place.
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    10
    Rep Power
    0
    Thank you for your reply..

    I am reasonably familiar with C++, so that sounds like a pretty solid option. I haven't really looked at Python before. What's good about it, and how does it differ from C++?


    Yes, encrypt and then upload is what I meant. Blame my bad typing for that. Is there a way to code a system where the encryption happens and the file is then uploaded to the server in one swift click?

    For example...

    Choose file(s) to upload to cloud.

    Button - encrypt (choose password) - button - upload.

    Upload with encryption completed.

    Encrypted file appears on server.


    I hope that made a reasonable amount of sense.
    Thank you for the comment regarding security. I will look into it further.

    Just to add, I am not asking anyone to code anything for me. I am only looking at research for different techniques and how the coding of cryptography works.
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Location
    Usually Japan when not on contract
    Posts
    240
    Rep Power
    12
    Funny, this is just the sort of thing my company is working on right now. Sort of.

    • What programming language(s) will I need to create the application? (in your opinion)

    Doesn't really matter -- we're using a mix of C, Python and PL/pgSQL (DB-related bits), and then whatever supported client applications are written in (and everything else is just file service).

    • I want to upload files and then encrypt them.. What is the best way to learn these techniques and why?

    "Upload and then encrypt" is backwards. You need to encrypt prior to sending anything and decrypt after receiving. Unencrypted traffic is trivial to capture, and unencrypted payloads within unencrypted traffic give a sniffer everything (context + content). This leaves transit as your vulnerable period, which, being more short-lived than storage, is far less likely to be attacked post-mortem by someone in the future.

    That said, the only secure way to do cloud storage is to physically own the "cloud" (which is why ours isn't "cloud" -- a meaningless marketing term -- it's "island"). Encryption is a moving target; what was hard to decrypt ten years ago is trivial today. With everyone slurping data and saving/reselling it forever (and Google, MS and Apple having NSA contracts...) what is safe right now won't be next year. So you can't entirely trust storage to a third party unless you keep a full-sized key, which defeats the point of remote storage.

    As for encryption, there are a bazillion open source utilities for this already, so I wouldn't say it's so much the skill to write encryption processes that's needed (they are fascinatingly subtle and very easy to get wrong, btw) as just a general idea of what is available across common systems (doing a search for encryption on Linux is a good place to start).
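
    To make the "use what already exists" point concrete, here is a minimal Python sketch of one piece that comes up as soon as a user picks a password for encryption: turning that password into a fixed-length key with PBKDF2 from the standard library's hashlib. The function name derive_key, the salt size and the iteration count are illustrative assumptions, not values from this thread, and the cipher that consumes the key should still come from a vetted library rather than your own code.

```python
import hashlib
import secrets
from typing import Optional, Tuple

SALT_BYTES = 16   # assumed salt size
KEY_BYTES = 32    # 256-bit key, e.g. for AES-256 in a vetted crypto library


def derive_key(password: str, salt: Optional[bytes] = None,
               iterations: int = 200_000) -> Tuple[bytes, bytes]:
    """Derive a fixed-length key from a password. Returns (key, salt).

    A fresh random salt is generated when none is given. The salt is not
    secret, but it must be stored next to the ciphertext so that the same
    key can be derived again when decrypting.
    """
    if salt is None:
        salt = secrets.token_bytes(SALT_BYTES)
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=KEY_BYTES
    )
    return key, salt
```

    The iteration count is a work factor: higher values slow down password guessing at the cost of a slower login, so it is worth measuring on the target machine.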

    • Links or ebooks that go into details of how to code security into web applications. mysql injection prevention, etc etc?

    I don't have a handy list of cloud-service-from-scratch how-tos. Anyway, most "cloud" howtos are gigantic, in-depth marketing materials pushing this or that cloud service in disguise. The skills needed really are not cloud centric and are certainly not web-only type skills. What you need is an understanding of networking in general, encryption in general and, most importantly (if you're on a schedule), what facilities the development environment you choose offers. Someone made the Python/C++ recommendation -- not a bad choice, especially if you really aim to write C but leave yourself open to handy tricks available from some of the cooler C++ libs (boost and Qt come to mind).

    The web is going to be a dead end if you want that to be actually secure. You need something native, self-signed, and self-hosted if you actually want security -- the true cost of security is usually paid in inconveniences, and we hate those, so we live in an overwhelmingly unsecured world that is plastered over with stickers and decals shaped like locks, "no-go" signs and the word "SECURE" in bright green letters. The idea that websites can be made secure using the standard certificate scheme, nameservices and https is demonstrably false -- though I'm sure someone will flame this answer, personally invested as they may be in the idea that the "web" is securable.

    • Anything else I need to consider to take on the project?

    Lots of stuff. But they are things that will pop up as you go and require further inquiry. For the moment consider the key points:
    - Storage must be owned by the user
    - Data must be encrypted prior to transmission (this implies that even if you use a web interface to the system, the system needs a real application server and you will require native client code)
    - The transmission layer itself would optimally be encrypted as well
    - A decision should be made early on whether you're doing dumb or smart storage (just files, or structured application data -- getting this wrong so often for decades is what the ongoing nightmare of XML is all about)

    Blah
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    10
    Rep Power
    0
    Originally Posted by zxq9
    Funny, this is just the sort of thing my company is working on right now. Sort of.


    Doesn't really matter -- we're using a mix of C, Python and PL/pgSQL (DB-related bits), and then whatever supported client applications are written in (and everything else is just file service).
    I will have to do plenty more research by the looks of things. A mixture of languages is something I was afraid of. I am on a strict time-scale (6 months) to get a fully functioning system. I almost don't want to overcomplicate it in case I start running out of time.



    Originally Posted by zxq9
    "Upload and then encrypt" is backwards. You need to encrypt prior to sending anything and decrypt after receiving. Unencrypted traffic is trivial to capture, and unencrypted payloads within unencrypted traffic give a sniffer everything (context + content). This leaves transit as your vulnerable period, which, being more short-lived than storage, is far less likely to be attacked post-mortem by someone in the future.
    Yes, I am aware I wrote this the wrong way round. It was a bad way to explain what I am trying to achieve. Thank you for the explanation.


    Originally Posted by zxq9
    That said, the only secure way to do cloud storage is to physically own the "cloud" (which is why ours isn't "cloud" -- a meaningless marketing term -- it's "island"). Encryption is a moving target; what was hard to decrypt ten years ago is trivial today. With everyone slurping data and saving/reselling it forever (and Google, MS and Apple having NSA contracts...) what is safe right now won't be next year. So you can't entirely trust storage to a third party unless you keep a full-sized key, which defeats the point of remote storage.

    As for encryption, there are a bazillion open source utilities for this already, so I wouldn't say it's so much the skill to write encryption processes that's needed (they are fascinatingly subtle and very easy to get wrong, btw) as just a general idea of what is available across common systems (doing a search for encryption on Linux is a good place to start).
    Yes, I have used Linux many times (ethical hacking), so I was planning to do most of my testing for the system on that operating system. I will look further into encryption on Linux, although my system will be created on Windows. Hmm, interesting.


    Originally Posted by zxq9
    I don't have a handy list of cloud-service-from-scratch how-tos. Anyway, most "cloud" howtos are gigantic, in-depth marketing materials pushing this or that cloud service in disguise. The skills needed really are not cloud centric and are certainly not web-only type skills. What you need is an understanding of networking in general, encryption in general and, most importantly (if you're on a schedule), what facilities the development environment you choose offers. Someone made the Python/C++ recommendation -- not a bad choice, especially if you really aim to write C but leave yourself open to handy tricks available from some of the cooler C++ libs (boost and Qt come to mind).

    The web is going to be a dead end if you want that to be actually secure. You need something native, self-signed, and self-hosted if you actually want security -- the true cost of security is usually paid in inconveniences, and we hate those, so we live in an overwhelmingly unsecured world that is plastered over with stickers and decals shaped like locks, "no-go" signs and the word "SECURE" in bright green letters. The idea that websites can be made secure using the standard certificate scheme, nameservices and https is demonstrably false -- though I'm sure someone will flame this answer, personally invested as they may be in the idea that the "web" is securable.
    Imagine my system as Dropbox, but personal and much more secure. It will be a self-hosted file storage application with a web interface. I will just sign in to the website and all my uploaded files will be there (encrypted). When I download a file I will have to decrypt it. I am not sure how to approach the decryption part of my system.

    Basically, my dissertation argument is that online storage websites are vulnerable and require third-party systems for encryption. I am trying to create a personal system that combines the two applications into one: cheaper, more secure, all run by the user.
    It's not completely finalised yet, but that's my idea. If you have anything to add here, feel free.

    It doesn't have to be overcomplicated, but you understand what I am getting at. The security element of the dissertation is going to focus primarily on the encryption; however, I still need to protect the system against MySQL injection, penetration methodologies, brute-force attacks, etc. If the encryption is fantastic but the site is vulnerable, then my project is deemed a failure.


    - The transmission layer itself would optimally be encrypted as well
    - A decision should be made early on whether you're doing dumb or smart storage (just files, or structured application data -- getting this wrong so often for decades is what the ongoing nightmare of XML is all about)
    These two points really stood out to me as something I need to know more about. If you wouldn't mind explaining these in more detail that would be brilliant. Is there sufficient research on these two subjects available?

    Finally, thanks for the information. It was really detailed, and the fact that your company is doing something similar means my dissertation has a purpose. I have to write 12,000 words on this project, so the system really needs to be completed by January. In your experience, do you think 6 months is long enough for a novice to learn and implement this particular system?
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Location
    Usually Japan when not on contract
    Posts
    240
    Rep Power
    12
    Originally Posted by JoeyG1717
    A mixture of languages is something I was afraid of. I am on a strict time-scale (6 months) to get a fully functioning system.
    Multiple languages actually simplify things, not complicate them -- but sometimes you have to understand where each language makes sense to see why. Consider how having several tools in the carpenter's box simplifies his job. That said, you're proving a point when writing a thesis, not writing consumer software, so 99% of what you need can be in Python -- it already does everything Java pretended it would eventually be able to do but still can't in a sane way.

    Internally at Tsuriai we have a very specific definition for research software: "any system that lacks real users". That's what you're writing. So you don't need to worry about performance, just a reasonable proof of your concept.

    So forget about writing performance-needy bits in C and just write Python. It's fast enough for most stuff, and the number-crunching library extensions are already written in C for you. The majority of your initial pre-implementation study will consist of skimming over the Python docs and any necessary external libraries/frameworks (not tutorials, not books, not howtos; the actual Python language docs so many newbies ignore). Once you understand what the core language libraries offer you will realize that most of what you thought would be hard has already been done for you.

    What's left is connecting the dots... more on that below.

    In your case, the languages other than Python would be a specific flavor of SQL and one procedural language extension to the DB you choose. I recommend Postgres over MySQL for a variety of reasons, but in your case primarily because of the internal security policies you can adopt to obviate most worries over injection (actual adherence to most aspects of the relational model is an added bonus). Postgres' implementation of SQL and its internal procedural extension PL/pgSQL or PL/Python are enough to cover most of the server-side stuff you need.
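
    Alongside database-side policies, the usual application-side guard against injection is to pass user input as query parameters instead of splicing it into the SQL text. A minimal sketch, using Python's built-in sqlite3 module purely as a stand-in for Postgres; the table, the function name files_for and the sample rows are made up for illustration. Postgres drivers such as psycopg2 use %s placeholders rather than ?, but the idea is the same.

```python
import sqlite3

# In-memory database as a stand-in for the real Postgres server
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, owner TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO files (owner, name) VALUES (?, ?)",
    [("joey", "thesis.pdf"), ("alice", "notes.txt")],
)


def files_for(connection, owner):
    """Return the file names owned by one user.

    The ? placeholder makes the driver send owner as a value, so quote
    characters inside it are never interpreted as SQL.
    """
    rows = connection.execute(
        "SELECT name FROM files WHERE owner = ? ORDER BY name", (owner,)
    )
    return [row[0] for row in rows]


# A classic injection attempt is just an odd-looking (and unknown) user name
hostile_input = "joey' OR '1'='1"
```

    Calling files_for(conn, hostile_input) looks for an owner with that literal name and finds nothing, whereas building the same query with string concatenation would have returned every row.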

    There are several web frameworks, DB bindings and ORMs that can get you from the DB to the web in Python. My favorite routes are stripping TurboGears down to the bare rendering processes or using Django. Django is probably your best bet if you're on a limited time budget and need a trivial proof, not a system that competes with an IBM custom solution. (There is a detail here of understanding how to get your Django-built web model to mesh with your hand-built for-real data storage and metadata model, but it isn't hard to do since Django doesn't insulate you too much from the DB.)

    Anyway, my point is you can get away almost entirely with Python and SQL here. Yes, it's near-polyglot programming and for a truly universal user-end system you'd need to write native clients in whatever is available (which means Python, Objective-C, and Google-flavored Java at a minimum -- because platform vendors are fantasyland ****s), but that's not what you need to do. You just need a proof of concept that works. Python + SQL (implying also PL/pgSQL or PL/Python).
    I will look further into encryption on Linux, although my system will be created on Windows.
    Windows is a bit of a handicap. If you could get away with it I'd do the whole thing on Linux as a proof first (client and server) and then port whatever you've got over to Windows. It's usually easier to go from Linux to Windows than the other way round, because development on Windows has a funny way of infecting your project with Windows-only bits, which suck to pinpoint and de-funk later -- whereas nearly any runtime you can find on Linux has a pure Windows port available you can utilize later when you want to spin up the Windows/Mac/foo version of ProjectX.
    Imagine my system as Dropbox, but personal and much more secure. It will be a self-hosted file storage application with a web interface. I will just sign in to the website and all my uploaded files will be there (encrypted). When I download a file I will have to decrypt it. I am not sure how to approach the decryption part of my system.
    You'll need a client-side application installed -- it will handle file-server authentication and encrypt/decrypt tasks (and just about nothing else unless you want it to). The web is your addressing system. More on this below.
    Basically, my dissertation argument is that online storage websites are vulnerable and require third-party systems for encryption. I am trying to create a personal system that combines the two applications into one: cheaper, more secure, all run by the user.
    It's not completely finalised yet, but that's my idea. If you have anything to add here, feel free.
    A ton to add here -- it's the core concept my company is going after because "the cloud" and the web are absolute jokes when it comes to keeping data even marginally secure. (See the digression below for more on the "ton to add here" bit -- since it doesn't advance your core goal.)

    That said, your assumptions lead you to your implementation concept:
    Proposition: The web is insecure, as is third-party storage and authentication.
    Conclusion: It can only be used as a notification system, not an end-to-end development platform, and not as an authentication layer.
    Solution: Initiate login via the web, but make no data manipulation functionality available from there. In other words the web can make A aware of B so they can handshake if they need to, but from the web you can't actually do anything that updates the DB. If you don't consider things like file listings themselves to be security problems, then permit that part to be displayed via the web as well. But the client-end must be running from a client-side installed native application, and the server must receive from a server-side installed system. Whether the user ever actually sees anything other than the web on his screen (other than initial client package installation) is all just an implementation detail. Consider that Dropbox leaves icons, service daemons and other artifacts around the user can interact with.

    Dropbox requires a client-side installation for this reason. You can create links that initiate actions that are registered on the client system involving your client-side application. Consider the HTML <a href> element when you use a "mailto:foo@bar.baz" tag -- it requests that the client system do whatever is registered as the default "mailto" protocol action. Make up your own protocol and register it on the client, or work whatever other form you find reasonable (a browser plugin that calls out to another application -- whatever).
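
    As a rough Python sketch of the client half of that idea: once a made-up scheme is registered with the operating system (a platform-specific step not shown here), the client application receives the clicked link as a string and has to validate it before acting. The scheme name "island", the action names, the default port and the "file" query parameter are all hypothetical choices for illustration.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, urlsplit

SCHEME = "island"                  # hypothetical scheme registered on the client
ALLOWED_ACTIONS = {"get", "send"}  # hypothetical action names
DEFAULT_PORT = 8443                # assumed default when the link gives no port


@dataclass(frozen=True)
class ClientAction:
    action: str
    host: str
    port: int
    file_id: Optional[int]


def parse_action_url(url: str) -> ClientAction:
    """Validate a link handed over by the browser and turn it into an action."""
    parts = urlsplit(url)
    if parts.scheme != SCHEME:
        raise ValueError("unexpected scheme: %r" % parts.scheme)
    if not parts.hostname:
        raise ValueError("link has no server address")
    action = parts.path.strip("/")
    if action not in ALLOWED_ACTIONS:
        raise ValueError("unknown action: %r" % action)
    # parse_qs returns a list per key; use the first "file" value if present
    file_values = parse_qs(parts.query).get("file")
    file_id = int(file_values[0]) if file_values else None
    return ClientAction(action, parts.hostname, parts.port or DEFAULT_PORT, file_id)
```

    A link such as island://files.example.org:8443/get?file=42 on the web page would then only tell the client where to connect and what to fetch; authentication and decryption still happen inside the client.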

    You can use the web as a notifier, but not a data solution of its own. The client and server can still require mutual authentication via totally separate self-generated keys that have nothing to do with the web, while the web side merely provides the connection address required.
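
    Here is a minimal Python sketch of the challenge/response shape such authentication can take, independent of the web login. Caveat: it uses a shared secret with HMAC from the standard library as a stand-in, because the standard library has no public-key signing. A real build would swap sign_challenge and verify_response for signatures from a vetted crypto library using self-generated key pairs. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def new_challenge() -> bytes:
    """Random nonce the verifier sends; never reused, so replies can't be replayed."""
    return secrets.token_bytes(32)


def sign_challenge(device_key: bytes, challenge: bytes) -> bytes:
    """Prove possession of the key without ever sending the key itself."""
    return hmac.new(device_key, challenge, hashlib.sha256).digest()


def verify_response(device_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Recompute the expected answer and compare in constant time."""
    expected = sign_challenge(device_key, challenge)
    return hmac.compare_digest(expected, response)
```

    For mutual authentication each side issues its own challenge to the other, so the client checks the server just as the server checks the client.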
    The security element of the dissertation is going to focus primarily on the encryption; however, I still need to protect the system against MySQL injection, penetration methodologies, brute-force attacks, etc. If the encryption is fantastic but the site is vulnerable, then my project is deemed a failure.
    This is what separating the task of notification/addressing from authentication/bona fides is all about. Logging in to the web would give you access to things that you don't really care about someone sniffing (and here you're going to do it over https without an http redirect anyway, so even the stuff you deem "unsensitive" will be transport-protected to the level most websites consider secure). But that is really just a way to make the client app that does the actual authentication, encrypt/decrypt, and file handling aware of what URL to contact when it needs to call home for something.

    And on the point regarding whether or not you're going to deal with "smart" or "dumb" data -- just do dumb data. Dilemma solved. Dumb data in this case involves files and data about those files, not structured data in any meaningful form. This is roughly similar to what most cloud services provide (and exclusively what Dropbox provides). The metadata bit could be as simple as a file name, or as complex as file attributes, ownership, associations, access and update logs, historical diffs, etc. But in your case I think ownership, filenames and maybe user-entered descriptions would be sufficient -- the historical stuff can get complicated and doesn't add any weight to your thesis. (Come to think of it, though, registering git as a web-callable service on the client system nearly does even the complicated stuff I mentioned... you'd just need glue to connect the startup call to a valid git call and handle errors.)
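
    In Python terms, "dumb data" could be as small as the sketch below: a metadata record holding only the fields suggested above (ownership, filename, user-entered description) next to a blob the server never looks inside. The class and field names are illustrative assumptions.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FileMetadata:
    owner: str
    filename: str
    description: str = ""   # optional, user-entered


@dataclass(frozen=True)
class StoredFile:
    metadata: FileMetadata
    ciphertext: bytes       # opaque to the server; never parsed or interpreted


# Example record: the server can list it without understanding the content
example = StoredFile(
    metadata=FileMetadata(owner="joey", filename="thesis.pdf", description="draft 3"),
    ciphertext=b"\x8f\x02\x11",  # placeholder bytes standing in for encrypted data
)
listing_row = asdict(example.metadata)
```

    Only listing_row would ever be shown by the web front-end; the ciphertext goes to and from the native client untouched.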

    [A bit of a digression on smart VS dumb data and why we're doing what we're doing at my company follows...]
    The system I am working on does both smart and dumb data, but in very different ways. Any data point which is an understood type of fact has a normalized backend model waiting for it under our system, and that backend builds a profile of whatever sort of entity connections have existed in reality for you over time.

    This kills two birds with one stone the way we do things: our business applications have a rather complete store of meaningful preloads for you when you use them and other applications can build value estimation models ("value" being an enormously broad term here) based on your use of those applications over time (for example, a CAD program with an extension to store its blueprints in our data backend will be incidentally populating a bill of materials, costing model, task timelines and customer historical values because those data points are all the same in reality, just not the way most software works today), and all those data points across users of each discrete system build an internal profile of each user's likely interests at any given time.

    It's the "Google bubble" but done safely -- when you're at work you're "anchored" (fully proxied) to your work "island" (server), and when you're at home you're anchored to your home island (and in any other context your anchor would be different as well if an island is available, or if you just want to partition your own existence further). Totally different profiles and entirely different users as far as search engines (and any other external services) can see -- and since the user actually owns the island servers themselves we can't make the decision to get prying and evil on them because we don't have access (they physically hold all of the original facts/data), despite running the infrastructure that makes it all tick.

    That way a painter can do a search for "latex bondage" at work and get the correct, innocently professional result, and later run the same search from home and get the correct, delightfully devious result on his own time where the meaning of the phrase may differ significantly -- while using the same device in both places as the same person (from his perspective).

    The entire market's failing here is trying to capture everyone's data and hold it ransom whilst rifling through it (or even outright claiming ownership over it -- check the YouTube, Google+, Dropbox and Facebook IP policies...), tempting users with "free" services to get at that data. That's silly, and once marketers realize there's nowhere left to go in online advertising the current market model will collapse anyway (same goes for industries built on the laughable notions of intellectual property law we currently entertain). What's important to profiling and the powerful user convenience it can provide is not where it's indexed but that it's indexed and accurately categorized -- which doesn't require a breach of anonymity because it's perfectly acceptable for the infrastructure side to deal with anonymous aggregate profile categories and the individual scanning, indexing, and other potentially scary parts to occur on a user's own system that they can exert absolute control over. (duh...) But the long-term market plan for companies like Google goes quite a bit beyond providing consumer convenience, which is why any of this is scary to begin with.
    In your experience do you think 6 months for a novice is long enough to learn and implement this particular system?
    Absolutely. You just need to break it down into achievable-sized chunks and forget about anything that falls outside the minimum feature set that illustrates your point. That means 2 or 2.5 languages, dumb data, web as a notification system and native client/server processes, and learning what libraries are already available to tie those ideas together.

    Sorry for the verbosity of my response. It's late and I don't have time to write a shorter one.
    Last edited by zxq9; July 11th, 2013 at 11:03 AM.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    10
    Rep Power
    0
    What a fantastic reply. Thank you so much. My knowledge of some of the subjects you spoke about is not too high, so I will do some reading on the things you suggested.

    With regard to planning the system, how would you go about it?

    I know there are certain ways companies go about the designing and implementation of a system of this scale.

    Now that I have some of your advice to work on I am probably a little more confused when it comes to 'linking it all together'. Does the system deal with the encryption and uploading/downloading of files whilst the web just serves as a kind of smoke screen of what I want the user to see?

    I think I need to have a system map plan of how I am going to design each part of my dissertation. I think a lot more research is going to be needed.
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Location
    Usually Japan when not on contract
    Posts
    240
    Rep Power
    12
    I would break it into three sub-projects:
    1. A DB schema that can accept the file data, metadata, and user information. In this simple case it would probably be just a few tables. (a table for the file data, a table for the metadata, maybe a table that tracks file MIME type associations, a user table and a registered device table -- each device needs its own private key, not just a user private key.)
    2. A client program that can generate keys, authenticate to the DB server using a PKI system (so you need the private key on the client), can encrypt a local file and save it as a new file, and can decrypt a file and save it locally.
    3. A web front-end that has a way of password authenticating a user, displaying his file metadata, and providing URLs to the data server which the client program in #2 can understand.

    Really not that much work, considering the power of such a system.
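
    As a rough sketch of sub-project 1, the tables listed above might look like the following. It uses Python's built-in sqlite3 module as a stand-in so it runs without a server; on Postgres the column types would differ (for example BYTEA instead of BLOB). Every table and column name is a guess for illustration. One assumption worth flagging: the devices table stores the public half of each device's key pair, on the reading that the private key stays on the client.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id            INTEGER PRIMARY KEY,
    username      TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL              -- for the web login only
);
CREATE TABLE devices (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    label      TEXT NOT NULL,
    public_key BLOB NOT NULL                 -- assumed: private half stays on the device
);
CREATE TABLE mime_types (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE file_data (
    id         INTEGER PRIMARY KEY,
    ciphertext BLOB NOT NULL                 -- already encrypted by the client
);
CREATE TABLE file_metadata (
    file_id      INTEGER PRIMARY KEY REFERENCES file_data(id),
    owner_id     INTEGER NOT NULL REFERENCES users(id),
    filename     TEXT NOT NULL,
    description  TEXT NOT NULL DEFAULT '',
    mime_type_id INTEGER REFERENCES mime_types(id)
);
"""


def create_schema(connection: sqlite3.Connection) -> None:
    """Create all tables and turn on foreign-key enforcement for this connection."""
    connection.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    connection.executescript(SCHEMA)
```

    Keeping file_data separate from file_metadata means the web front-end can be given read access to the metadata alone and never touch the stored ciphertext.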
  16. #9
  17. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    10
    Rep Power
    0
    Originally Posted by zxq9
    I would break it into three sub-projects:
    1. A DB schema that can accept the file data, metadata, and user information. In this simple case it would probably be just a few tables. (a table for the file data, a table for the metadata, maybe a table that tracks file MIME type associations, a user table and a registered device table -- each device needs its own private key, not just a user private key.)
    2. A client program that can generate keys, authenticate to the DB server using a PKI system (so you need the private key on the client), can encrypt a local file and save it as a new file, and can decrypt a file and save it locally.
    3. A web front-end that has a way of password authenticating a user, displaying his file metadata, and providing URLs to the data server which the client program in #2 can understand.

    Really not that much work, considering the power of such a system.

    Thanks for that three-part list. If it's all right (I hope), I am going to try to lay out how I believe this should be done, along with some of what I have researched. Excuse my lack of technical language in what follows, but your replies to my quotes will help me understand what the hell is going on.

    So, basically I need to create 1 project that is made up of 3 sub-projects.

    Sub-project 1

    Db Schema: server-side? The DB is on the server (which I host) that will have stored all encrypted files that have been uploaded. The database will have tables that will include user login details, registered devices (could you limit devices used for a user?), a Metadata table (for web front-end? what tuples would belong here?) and a Table for file data. Once I have designed the DB ERD maybe you wouldn't mind giving your input to it. I will leave the details of the DB for now.

    I have downloaded PostgreSQL version 9.2.4... Anything else I need DB related?



    Sub-project 2

    Client-side application that will generate keys and authenticate to the DB server. Is this just how the client program 'talks' to the server securely? A user needs a private key and so does the registered device. Why so? I know that a PKI system should be used to implement this; I need to do more research on the coding in this area. The program should be able to select a file from the user's computer (local) and then encrypt it (something along the lines of a script that is assigned to an 'encrypt' button on a GUI?). The new encrypted file would be saved. The client program would then send that over to the DB (I assume via another script, linked to a 'save file' button on the GUI, that sends the encrypted file to the server?).

    I am not sure the process of how the decrypt and download files is going to work... But this is my assumption.

    Web front-end link shows the encrypted file. User clicks the link, and a GUI appears with a 'Decrypt' button. User decrypts, and a GUI appears with a 'save file' button. File saves where the user specifies. A script is needed for each button respectively.

    I also assume that getting the client to understand what the web is asking it to do is complicated? does this refer back to the protocols you mentioned before? I.E mailto: etc.

    I have downloaded python 3.3.2 is there any libraries that would be good for the scripts that need to be created?


    sub-project 3

    Web Front-end. I will keep this simple.

    User goes to domain. Login required. User successfully logs in.(the login is linked to the db server right?)

    All files appear as links for the user to browse. Another link on the page says 'upload Files' (it needs to link to the client program that is created for encryption). A GUI/web-page opens for the user to begin uploading.

    A user can also click links for the decryption method already spoken about above.


    Moral of the web front-end project is that it just appears as if it is doing all the hard work but its actually just a smoke-screen of what's going on behind closed doors? The web knows about the client program but it has no idea what it effectively does. In effect this makes it more secure as a project. correct?

    Ok I think I am done. Now I know there are way too many questions in this reply (apologies). If you can reply then just summarise each sub-project as much as you feel necessary.


    Front end web site can be just coded with HTML/HTML5/PHP?


    Thanks again.
  18. #10
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Location
    Usually Japan when not on contract
    Posts
    240
    Rep Power
    12
    In broad strokes you've got it right. The interface (GUI) and web link elements are actually easier than you're imagining them to be. We don't need an "encrypt" or "decrypt" button -- you're thinking like the developer. Instead think like the user. Everything is going to be encrypted by default, so why trouble the user with that? Your client application encrypts as part of sending and decrypts as part of receiving so a "get" and "send" are all you need (and not even that if you want to provide automatic syncing).

    Db Schema: server-side? The DB is on the server (which I host) that will have stored all encrypted files that have been uploaded. The database will have tables that will include user login details, registered devices (could you limit devices used for a user?), a Metadata table (for web front-end? what tuples would belong here?) and a Table for file data. Once I have designed the DB ERD maybe you wouldn't mind giving your input to it. I will leave the details of the DB for now.
    There are two halves to the user login -- the stuff the web application is permitted to see (the web login username/password) and the actual client login to the DB. Web frameworks nearly always have a (stupid) setup where the framework itself is a db user and owns the database. Don't do this.

    Set up a db user for the framework that doesn't own anything, but is allowed to validate logins (use access on a password validation function, but not read access to the hashed password column) and, if you need it to be able to, create new users (insert/create on the appropriate rows on the web side of the user table).

    Then set up a db user to represent the actual user. In Postgres users are called "roles" and roles can be members of other roles. So beneath the role that represents the actual user you would create a role per device that the person will log in from. That way each device can have its own private key (which authenticates the device and the user -- and lets you audit/roll back historical changes to the actual user's data in the event of a breach) and the user isn't bothered with carrying around his private key everywhere (and devices can independently be removed from login permission).

    I got a bit involved with the authentication part there, but that's really the hardest part of the whole system, so it requires a bit of explanation -- especially since web framework design will nearly always steer you in enormously insecure directions.
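    As a rough illustration of the role-per-device idea, here is a minimal sketch that only generates the SQL text; it does not talk to a database. The function names, the "username_device" naming convention and the strict name check are my own assumptions for the example, not anything Postgres requires.

```python
import re

# Conservative pattern for role names (an assumption for this sketch):
# anything else is rejected rather than quoted. Reserved words such as
# "user" would still need extra handling.
_NAME = re.compile(r"[a-z_][a-z0-9_]*")


def _check(name):
    """Return name unchanged if it looks like a safe, unquoted identifier."""
    if not _NAME.fullmatch(name) or len(name) > 63:
        raise ValueError("unsafe role name: %r" % (name,))
    return name


def user_role_sql(username):
    """SQL for the role that represents the person; it cannot log in itself."""
    return "CREATE ROLE %s NOLOGIN;" % _check(username)


def device_role_sql(username, device):
    """SQL for one device role that can log in and is a member of the user's role."""
    role = _check("%s_%s" % (username, device))
    return "CREATE ROLE %s LOGIN IN ROLE %s;" % (role, _check(username))


def revoke_device_sql(username, device):
    """SQL that removes a single device and leaves the user's other devices alone."""
    return "DROP ROLE %s;" % _check("%s_%s" % (username, device))


if __name__ == "__main__":
    print(user_role_sql("joey"))
    print(device_role_sql("joey", "laptop"))
    print(revoke_device_sql("joey", "laptop"))
```

    You would paste the output into psql (or execute it from a privileged admin script) rather than let the web framework's db user run it.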
    I have downloaded PostgreSQL version 9.2.4... Anything else I need DB related?
    Nope. Postgres has all the functionality you need to implement the server side out of the box. The only missing link on the server is a web front end. More on that below.
    Client-Side Application that will generate keys, authenticate to the DB server. Is this just how the client program 'talks' to the server securely?
    Basically, yes. The first time a device is used the user needs to register the device with the db, and to make this easy for the user it's better if the client knows how to produce a key pair on its own. The key is only used for authentication. Authentication goes both directions: the client proves it's really the client, and the server authenticates to the client to prove that it's really the server. Since this is a closed, self-owned system you don't need certificates, you just need clients that have the server's public key pre-loaded. After initial authentication the db and the client will create a symmetric key and actually use that for the connection. Read up on ssh and the way TCP works.
    The program should be able to select a file from the user's (local) computer, then encrypt it. (Something along the lines of a script that is assigned to an 'encrypt' button on a GUI??). The new encrypted file would be saved. The client program would then send that over to the Db. (I assume another script would send the encrypted file to the server, linked to a 'save file' button on a GUI?)
    It's best not to make the user have to know all that since this is the process that must always happen anyway. Instead, just have a "sync file" button or something, and when it's clicked everything you wrote above happens. There is a whole world of even cooler/user-friendlier stuff you could do here, but none of it would make the point of your thesis any stronger.
    I am not sure the process of how the decrypt and download files is going to work... But this is my assumption.
    Same thing, just in reverse. The user sees a file that isn't the same version he's got on the local system. When he clicks it the download/decrypt process happens in reverse.

    We haven't discussed the encrypt/decrypt keys. This is separate from the authentication bit and has to be stored in some form on the server and passed as part of login so that the client has access to the right key to encrypt/decrypt. But worry about this later. For right now don't even worry about file encryption at all -- encryption just makes you think about encryption too much (for now). Files are just large numbers, so for the time being build a system that can shovel numbers between the client and the server with accuracy. We can make those numbers appear as gibberish to outsiders later.
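    One simple way to check the "with accuracy" part is to compare a digest of what went out with a digest of what came back. This is only a sketch: SHA-256 is my choice for the example, and the helper names are made up.

```python
import hashlib


def fingerprint(data):
    """Return the SHA-256 hex digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def arrived_intact(sent, received):
    """True when the bytes that came back match the bytes that went out."""
    return fingerprint(sent) == fingerprint(received)


if __name__ == "__main__":
    original = b"any file is just a sequence of bytes"
    print(arrived_intact(original, original))       # identical copy
    print(arrived_intact(original, original[:-1]))  # truncated copy
```

    In a round-trip test you would read the file in binary mode, send it, fetch it back, and compare the two.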
    Web front-end link, Shows encrypted file.

    ...

    I also assume that getting the client to understand what the web is asking it to do is complicated? does this refer back to the protocols you mentioned before? I.E mailto: etc.
    You got ahead of yourself with the GUI stuff, but basically yes -- except that it's not complicated. You can just provide a URL that the server can understand. There is a (somewhat pointless) "standard" for URL schemes that we can avoid thinking too hard about if we simply adhere to it. We make up a protocol name "foo", and determine that the "address" of a file is the username plus whatever additional organization stuff you want to add, then the filename. Just like a Unix file location (since that's what URLs are roughly modeled after). So "foo:myserver.com/my_username/file.txt" is the accessor for the file. It's not really stored at "/my_username/file.txt" on the server, but to the outside world that's the address of the file. The client is registered to the protocol named "foo" on the client desktop (the same way Thunderbird or Evolution or whatever is registered to the "mailto" protocol), so when a browser link of the form "foo:xxx" is clicked, your client program starts up with the argument "myserver.com/my_username/file.txt", which the server understands as requiring authentication as one of the devices registered to the role "my_username" and, upon authentication, the delivery of the large number called "file.txt".
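    To make the accessor idea concrete, here is a small sketch of how a client might pull such an address apart. The function name and the exact layout (host, then username, then everything else as the file's "address") are assumptions taken from the example above, not a fixed format.

```python
def parse_accessor(url, scheme="foo"):
    """Split 'foo:host/username/path/to/file' into (host, username, filename)."""
    prefix, sep, rest = url.partition(":")
    if not sep or prefix != scheme:
        raise ValueError("not a %s: accessor: %r" % (scheme, url))
    rest = rest.lstrip("/")  # tolerate 'foo://host/...' as well
    parts = rest.split("/")
    if len(parts) < 3 or not all(parts):
        raise ValueError("expected host/username/filename in %r" % (url,))
    host, username = parts[0], parts[1]
    # Everything after the username is treated as the file's "address".
    filename = "/".join(parts[2:])
    return host, username, filename


if __name__ == "__main__":
    print(parse_accessor("foo:myserver.com/my_username/file.txt"))
```

    The client would then authenticate to the host as one of that user's devices and ask for the named file.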
    I have downloaded python 3.3.2 is there any libraries that would be good for the scripts that need to be created?
    Your closest target should be learning how to connect to Postgres from a Python script using the psycopg2 bindings. It's really easy once you've done it once. Just do password authentication and plaintext for now. You can get fancy with the SSL stuff later.

    Later on you'll need openssl to generate keys and certificates, and you'll need to read the Postgres docs pertaining to authentication -- but worry about this later, right now your goal is to make a functional plain text system that does basically what you want so that you can explore the basic problem of sending/receiving data from a client to a server.
    Web Front-end. I will keep this simple.
    ...
    Front end web site can be just coded with HTML/HTML5/PHP?
    Keep it annoyingly simple for now. I think the discussion above gives you some idea what is required here. As I wrote above I'd probably opt for using Django instead of PHP, just because it reduces the language requirements that much more. But if you're comfortable with PHP there is nothing wrong with it -- the db user that the web side will be using will lack privileges to do anything bad anyway. This really comes down to simplicity: from where I stand a Python framework would be simpler, but if you already know PHP then it's not anything extra to learn (whereas learning the tiny bit necessary to make a Django frontend would be). In the end, it doesn't matter what you use to write it, just pick something.

    Hopefully this meandering explanation shed some light. The moral of the story is to keep it brainstompingly simple at this point. Make a table in Postgres that can receive, say, a text string. Make a Python script that can send one. Then make the script so it can also receive one. Then make the schema more interesting so the strings stored are associated with "filenames" and user entries. Then make the schema more interesting so that different Postgres roles/users have privileged access to their files, etc. Build up from there, always keeping a working core. This is unfamiliar territory for you, so if you try to do everything at once you'll just watch it explode in your face and go gray early.
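    The first couple of steps might look something like the sketch below. The table name "notes", its columns, and the connection string are placeholders I made up; psycopg2 is imported inside main() only so the two helpers can be read and tried without a database at hand.

```python
def send_text(conn, body):
    """Insert one string into the notes table and commit."""
    cur = conn.cursor()
    # Parameters are passed separately so the driver does the quoting.
    cur.execute("INSERT INTO notes (body) VALUES (%s)", (body,))
    conn.commit()
    cur.close()


def fetch_texts(conn):
    """Return every stored string, oldest first."""
    cur = conn.cursor()
    cur.execute("SELECT body FROM notes ORDER BY id")
    rows = cur.fetchall()
    cur.close()
    return [row[0] for row in rows]


def main():
    import psycopg2  # third-party driver; needs a running Postgres to connect

    # Hypothetical connection details; replace with your own.
    conn = psycopg2.connect("dbname=test user=joey")
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS notes (id serial PRIMARY KEY, body text)")
    conn.commit()
    cur.close()
    send_text(conn, "hello from the client")
    print(fetch_texts(conn))
    conn.close()
```

    Call main() from a Python shell once your local Postgres is up. From there the schema can grow filenames and per-user ownership one step at a time.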
  20. #11
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    10
    Rep Power
    0
    Hopefully this meandering explanation shed some light. The moral of the story is to keep it brainstompingly simple at this point. Make a table in Postgres that can receive, say, a text string. Make a Python script that can send one. Then make the script so it can also receive one. Then make the schema more interesting so the strings stored are associated with "filenames" and user entries. Then make the schema more interesting so that different Postgres roles/users have privileged access to their files, etc. Build up from there, always keeping a working core. This is unfamiliar territory for you, so if you try to do everything at once you'll just watch it explode in your face and go gray early.
    This last bit really helped me put the rest of your reply into perspective. I have been thinking about the project as this incredibly new, difficult system built in languages I have no knowledge of. I will slowly create some Python scripts that authenticate with Postgres. I will build up a Db_schema and see how far I get. Hopefully my next reply will feature some code. Onwards and upwards, as they say.
  22. #12
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    10
    Rep Power
    0
    Hello again,

    I have been doing some work on python scripts. Very simple stuff. Connecting to my postgres database. Inserting text into a column using a simple query. I can display some information from a table also via simple queries.

    Now the thing I am slightly confused about is how do I get these things working 'online'. I have django set up and can produce a basic 'Django powered webpage'. I just don't know the process of how you get it to run scripts.

    I have done a lot of research and got a lot of information that makes me confident I can code this stuff pretty adequately. The hardest bit I think for me is putting everything together. I am thinking like a Front-end web developer and I believe its hindering my progress of how to produce the outputs I desire.

    If I knew a basic outline of how to -

    • Execute an upload/send of a file to the database online

    • how scripts are executed using django


    If you can just point out the basic way of going about this then I believe the system can start to be firmly put into development.


    Also, I would just like to add my thanks for suggesting django-python. It is very simple to understand and I think the language is really easy to learn so far. It makes a change from the struggles I have with Java (I hate it).

    Just so you get a better idea of how I am trying to test this system...

    I have installed:

    Django
    Python 3.3
    psycopg2
    postgresql
    postgresql admin III
    wampserver


    So I have a database which runs locally. (just for testing) I will be putting it on a dedicated server area.

    I have the small django server which can run if I need it to.

    An area in wamp setup for the front-end website.


    Apologies for my confusion. I just want to have 50% of my system done before the end of September so I have constant questions that I want answering.
    If you get time to continue helping me out then that would be greatly appreciated.

    finally, would you mind if I used your username from this site as a written reference? (feel free to decline)
  24. #13
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Location
    Usually Japan when not on contract
    Posts
    240
    Rep Power
    12
    Originally Posted by JoeyG1717
    ...Connecting to my postgres database. Inserting text into a column using a simple query. I can display some information from a table also via simple queries.
    Wonderful! Then you have a proto-client application already. The client bit is a totally independent application from the web interface.
    Now the thing I am slightly confused about is how do I get these things working 'online'. I have django set up and can produce a basic 'Django powered webpage'. I just don't know the process of how you get it to run scripts.
    Everything in Django is just a Python file, so anything can be a callable. Unlike in Java, you can just declare functions in Python without having to wrap that in a class. So if you need a Django view to execute some additional action in the background (in other words, if you need a view function to have a side effect other than db I/O and webpage rendering) you can either call functions that do whatever you want from within the view or write whatever process you want inline directly within the view itself.

    This is how I used the Django rendering engine to make an ODF report rendering module [link] -- by calling out to some other processes I wrote that can properly pack/unpack ODF files and use the Django template language to render the XML with whatever db data is needed within the document. (If you read what's in the tarball it's still in a bit of an ugly/works-for-me status -- depends on calls to some bash scripts, etc. -- but compare it with what's in the Django docs and you'll probably grok what's happening fairly quickly.)
    python Code:
    from django.shortcuts import render

    def my_view(request, template='some_template.html'):
        # Generate a context dict relevant to the request
        # (gen_context stands in for a helper you write yourself)
        context = gen_context(request)
     
        # Let's do some totally arbitrary stuff
        # ("with" closes the file even if the write fails)
        with open('/tmp/foo.txt', 'w') as f:
            f.write("I'm a new text file! Django made me on a whim! Save me from entropy! AHH!!!")
     
        # Do what is expected: return a rendered page to the server
        return render(request, template, context)

    Before viewing the page that is pointed to by my_view() check /tmp. There is no foo.txt (unless you already had another one there). Then open the URL owned by my_view() and then try "cat /tmp/foo.txt" and you'll see the text file created by the view sitting there.

    So "how to run a script from Django?" That's like asking "how to call a function in Django?" -- just write what you want it to do. It's really just that simple. Kinda cool -- but obviously you don't want to get too crazy with this power tool -- you might cut your finger(s) off.
    I have done a lot of research and got a lot of information that makes me confident I can code this stuff pretty adequately. The hardest bit I think for me is putting everything together. I am thinking like a Front-end web developer and I believe its hindering my progress of how to produce the outputs I desire.
    You have identified your conceptual problem: you are way too focused on the web. You should be focused on the client <-> db part more than anything right now. The web part is so simple to do that you don't need to sweat it right now. The web is a bunch of trivial hacks over a document publication protocol -- so leave it like that, trivial and publication-oriented. You have more interesting goals in mind, so I recommend focusing on those first. Tying them together isn't so hard -- and when you figure out how/why you'll go "GHAH! That's it?"

    There is an interesting conceptual implication in reminding ourselves that the web is built on a publication protocol: it is not important of its own accord. In other words, to generate data within the web is a hack, and always feels that way no matter what environment you're in (unless you've never seen clean I/O and then it might feel like the One True Way for lack of known alternatives). But publishing data on the web is awesome. It's perfect for that. So what lesson lies here for your proposed system? The client application, for the most part, is the only thing doing any new data creation. The web is just providing a quick, clickable table of contents of what resides there. The client is also the only thing actually pulling files. So don't worry about the web too much.
    • Execute an upload/send of a file to the database online
    If by "online" you mean "via the web" then you're violating all the security stuff we talked about (a conceptual regression). Your client application does this. You can display a link on a webpage though that points to the db server using a protocol name you make up, and register that protocol name with your local desktop (or whatever) to call your client app. So let's call the protocol "fudj" short for "File Uploader de Joey". A link would look like <a href="fudj://username@server_name.com"> or something very similar. Since everything after "fudj" is going to be passed to the client as its initial argument by the window manager anyway you can form the rest of the link any way you want, but generally speaking URL format is easy to deal with.
    • how scripts are executed using django
    See above. But I don't see you really needing any scripts from Django. I think you were getting mixed up, thinking that Django would call a script and make an upload happen -- but that's crazy, since the client side has to push the data for there to be an upload, and Django can't initiate a script action on the client side since it runs on the server. The way your web page can signal the client side is by providing a URL such as the one above, which provides enough information for the client app to find the server and initiate the upload process.
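    On the client side, receiving that link could be as small as the sketch below. It assumes the desktop hands the whole clicked link to the program as its only argument; how the "fudj" scheme gets registered differs per platform and isn't shown. The function names are made up for the example.

```python
from urllib.parse import urlsplit


def parse_fudj(url):
    """Return (username, host) from a link such as 'fudj://username@server_name.com'."""
    parts = urlsplit(url)
    if parts.scheme != "fudj":
        raise ValueError("not a fudj link: %r" % (url,))
    if not parts.username or not parts.hostname:
        raise ValueError("link must name a user and a server: %r" % (url,))
    return parts.username, parts.hostname


def main(argv):
    """Entry point the desktop would call with the clicked link as the only argument."""
    if len(argv) != 1:
        print("usage: client fudj://username@server")
        return 2
    username, host = parse_fudj(argv[0])
    # A real client would authenticate and start the transfer here.
    print("would connect to %s as %s" % (host, username))
    return 0
```

    Wire it up with something like sys.exit(main(sys.argv[1:])) at the bottom of the script once you are ready to test it from the shell.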
    Django
    Python 3.3
    psycopg2
    postgresql
    postgresql admin III
    wampserver
    Cool. That's just about all you'll need (though wampserver installs a lot more than you need). I think developing on Windows will be a little trickier than on Linux, since the community of Windows developers is so small compared to the Linux one... when you run into platform-specific issues you might just be on your own.

    I'd steer clear of any gui or web admin interfaces for Postgres for now (actually, forever). I strongly recommend learning how to use psql directly. It's a lot faster for many things in the same way the shell is a lot faster than a windowed desktop, and it's independently scriptable. It also doesn't hide anything from you and it doesn't let you zone out while looking at lines on the screen, fooling yourself into thinking that you understand what you're seeing. Relational data is way too complex to display graphically -- the only place that you can really visualize it is in your mind. Pretty screens and clickables distract from that. I'm not saying this particular project is very complex or anything, but if you learn even a tiny bit about Postgres it will benefit you for a long, long time -- so developing good habits now is important.

    Everything you need for development and testing can be done on a single computer, though it's nice to do it on two just so you know for a fact that you're never tying anything to local access assumptions.
    finally, would you mind if I used your username from this site as a written reference? (feel free to decline)
    Sure, you can cite me as a reference. That's sort of a big deal in academia so here you go: My real name is Craig Everett, I work for a computing services company called Tsuriai out here in Japan and my personal site is http://zxq9.com.

    I might elaborate on a few of the challenges you face on my blog. Actually, one of the issues you might run into is getting Django to "see" tables in Postgres that you created directly in SQL -- an issue I addressed [here]. Don't expect that entire post to make sense to you up front, but do expect a rather robust understanding of all issues involved to creep up on you over the next month or so.
  26. #14
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    10
    Rep Power
    0
    Wonderful! Then you have a proto-client application already. The client bit is a totally independent application from the web interface.

    Everything in Django is just a Python file, so anything can be a callable. Unlike in Java, you can just declare functions in Python without having to wrap that in a class. So if you need a Django view to execute some additional action in the background (in other words, if you need a view function to have a side effect other than db I/O and webpage rendering) you can either call functions that do whatever you want from within the view or write whatever process you want inline directly within the view itself.

    This is how I used the Django rendering engine to make an ODF report rendering module [link] -- by calling out to some other processes I wrote that can properly pack/unpack ODF files and use the Django template language to render the XML with whatever db data is needed within the document. (If you read what's in the tarball it's still in a bit of an ugly/works-for-me status -- depends on calls to some bash scripts, etc. -- but compare it with what's in the Django docs and you'll probably grok what's happening fairly quickly.)
    python Code:
    def my_view(request, template='some_template.html'):
        # Generate a context dict relevant to the request
        context = gen_context(request)
     
        # Let's do some totally arbitrary stuff
        f = open('/tmp/foo.txt', 'w')
        f.write("I'm a new text file! Django made me on a whim! Save me from entropy! AHH!!!")
        f.close()
     
        # Do what is expected: return a rendered page to the server
        return render(request, template, context)

    Before viewing the page that is pointed to by my_view() check /tmp. There is no foo.txt (unless you already had another one there). Then open the URL owned by my_view() and then try "cat /tmp/foo.txt" and you'll see the text file created by the view sitting there.
    I did go into the views and url sections of django. I think I am going to leave this bit until the client system is completed. I really have over-thought the situation. Make a protocol and call it. Should have known it could be that simple. Having nightmares still about Java, when in reality Python is just a much more friendly language. I need to wake up.

    So "how to run a script from Django?" That's like asking "how to call a function in Django?" -- just write what you want it to do. It's really just that simple. Kinda cool -- but obviously you don't want to get too crazy with this power tool -- you might cut your finger(s) off.

    You have identified your conceptual problem: you are way too focused on the web. You should be focused on the client <-> db part more than anything right now. The web part is so simple to do that you don't need to sweat it right now. The web is a bunch of trivial hacks over a document publication protocol -- so leave it like that, trivial and publication-oriented. You have more interesting goals in mind, so I recommend focusing on those first. Tying them together isn't so hard -- and when you figure out how/why you'll go "GHAH! That's it?"

    If by "online" you mean "via the web" then you're violating all the security stuff we talked about (a conceptual regression). Your client application does this. You can display a link on a webpage though that points to the db server using a protocol name you make up, and register that protocol name with your local desktop (or whatever) to call your client app. So let's call the protocol "fudj" short for "File Uploader de Joey". A link would look like <a href="fudj://username@server_name.com"> or something very similar. Since everything after "fudj" is going to be passed to the client as its initial argument by the window manager anyway you can form the rest of the link any way you want, but generally speaking URL format is easy to deal with.
    Yes, I realise this now. The web stuff is really not going to be a problem. Your protocol example just spelled out to me 'Joey, you're not thinking clearly'. It really is that simple.


    Cool. That's just about all you'll need (though wampserver installs a lot more than you need). I think developing on Windows will be a little trickier than on Linux, since the community of Windows developers is so small compared to the Linux one... when you run into platform-specific issues you might just be on your own.
    Yeh, I am still contemplating this. The issue I keep asking myself is I don't trust my University lecturers to be Linux developers. I would be surprised if my Dissertation supervisor was qualified in anything to do with Linux. This basically puts me in the category of relying solely on the internet and forums to get me through any issues I may come across. Windows has been nothing short of a nightmare so far.


    Everything you need for development and testing can be done on a single computer, though it's nice to do it on two just so you know for a fact that you're never tying anything to local access assumptions.
    Well when I present my system at the end of my university year I was going to show how it works over 2 computers. 1 computer was going to be nothing more than a web-server with the database on it. The other computer is going to use the system to upload and download a file etc etc. I thought that would give me more credit from markers.

    Sure, you can cite me as a reference. That's sort of a big deal in academia so here you go: My real name is Craig Everett, I work for a computing services company called Tsuriai out here in Japan and my personal site is http://zxq9.com.

    I might elaborate on a few of the challenges you face on my blog. Actually, one of the issues you might run into is getting Django to "see" tables in Postgres that you created directly in SQL -- an issue I addressed [here]. Don't expect that entire post to make sense to you up front, but do expect a rather robust understanding of all issues involved to creep up on you over the next month or so.
    Thanks for that. I have to use everything as a reference these days. So anything you help me with, or talk to me about, they can't drop my grade.

    So would you agree that I should just forget that the web exists for now?? I will just focus on python scripts using the shell.
    With the client app do you think the system could be done in one script or do I need to break it down into different sections?

    Now that I have scripted some basic queries how should I proceed to making this more 'interesting'....
  28. #15
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Location
    Usually Japan when not on contract
    Posts
    240
    Rep Power
    12
    Originally Posted by JoeyG1717
    I did go into the views and url sections of django. I think I am going to leave this bit until the client system is completed.
    This is probably a good decision. Once you get the client-server bits working the web part will be a snap.
    [...about Linux] I keep asking myself is I don't trust my University lecturers to be Linux developers. I would be surprised if my Dissertation supervisor was qualified in anything to do with Linux. This basically puts me in the category of relying solely on the internet and forums to get me through any issues I may come across. Windows has been nothing short of a nightmare so far.
    Considering that Linux is free and you want to do two computers anyway... it wouldn't be so hard to set a Linux server up for yourself, especially if you've got an ancient or otherwise super-cheap/free computer sitting around somewhere. We actually do a lot of server-side development for the < 1000 users segment that targets Linux on "netbook" class AMD processors (caveat to this is that those systems have SSDs for the core system files, tons of uber-fast memory and giant HD arrays) and haven't had any performance issues, even using Django. Doing a single user system your performance bottleneck will be the network, so you could easily get away with grabbing an ancient single-core Celeron laptop from the garbage (seriously) and using it as a dev server.

    You could also set up a virtualized environment with two minimal Linux instances running on your Windows system and doing their network communication across your virtual bridge. I don't really know what is available for Windows virtualization on the FOSS side of the fence, but I've done this to test out concepts from time to time using KVM on Linux (like on a long flight where I can't access anything and really play around). I'm sure something similar for Windows must exist.
    Well when I present my system at the end of my university year I was going to show how it works over 2 computers. 1 computer was going to be nothing more than a web-server with the database on it. The other computer is going to use the system to upload and download a file etc etc. I thought that would give me more credit from markers.
    It totally would. Do it that way. But you don't require such a setup just now, today, to get your basic client/server stuff sorted. Connecting from a Python program to Postgres is just "psycopg2.connect(auth_string)". The formation of auth_string is all that changes when you move from a localhost service to a remote one (yeah, some firewall/host OS setup is required, but that's always true) -- and hey, whaddya know, the formation of auth_string is exactly the part that is going to become a function that parses the web-provided URL! Read that last bit again -- the formation of auth_string will become its own function, which means you can make a (mostly) empty function right now that returns a valid auth_string, and just elaborate on it later. So don't sweat it.
    python Code:
    server = "localhost"
    auth_string = gen_auth_string(server)
    psycopg2.connect(auth_string)

    today becomes
    python Code:
    settings = gen_settings(opts, args)
    auth_string = gen_auth_string(settings)
    psycopg2.connect(auth_string)

    tomorrow. Easy to do that then, and figure out how to pass files between a client program (local or not) and the db.
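    A minimal sketch of what that mostly empty gen_auth_string could look like today, assuming it returns a libpq-style "key=value" string (one of the forms psycopg2.connect() accepts). The database name, user, and port defaults are made-up placeholders, not values from this thread:

```python
def gen_auth_string(server, dbname="storage", user="joey", password=None, port=5432):
    """Build a libpq-style connection string for psycopg2.connect().

    The dbname, user and port defaults are hypothetical placeholders.
    Values containing spaces or quotes would need quoting, which this
    sketch does not attempt.
    """
    parts = {"host": server, "port": port, "dbname": dbname, "user": user}
    if password is not None:
        parts["password"] = password
    # Join the pairs in insertion order: host, port, dbname, user[, password]
    return " ".join(f"{key}={value}" for key, value in parts.items())


auth_string = gen_auth_string("localhost")
print(auth_string)
```

    Moving to a remote server later only changes the argument passed in; the connect call stays the same.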
    So would you agree that I should just forget that the web exists for now?? I will just focus on python scripts using the shell.
    I wish the world would forget the web exists for at least a week (and go meet each other again...) -- convincing you to do so until your client program basically works is a small victory. As for being able to run the script effectively from the shell -- yes, do that. This does a few cool things for you, actually. For one, your program already will have multiple interfaces -- anyone else who wants to come behind you and tack on a feature that depends on your system is free to, because it has a command line interface that works. Going a step further, of course, is beneficial -- you can make a full-blown GUI later without losing any sleep over it.
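    One way the command line interface could start out, sketched with the standard library's argparse. The subcommand names, the --server option, and the helper settings_from_argv are assumptions made for illustration, standing in for whatever gen_settings ends up being:

```python
import argparse


def build_parser():
    """Describe a hypothetical 'upload' / 'download' command line."""
    parser = argparse.ArgumentParser(description="Toy storage client")
    parser.add_argument("--server", default="localhost",
                        help="database host to connect to")
    commands = parser.add_subparsers(dest="command", required=True)

    upload = commands.add_parser("upload", help="store a local file")
    upload.add_argument("path")

    download = commands.add_parser("download", help="fetch a stored file")
    download.add_argument("name")
    return parser


def settings_from_argv(argv):
    """Turn a list of command-line arguments into a plain settings dict."""
    return vars(build_parser().parse_args(argv))


settings = settings_from_argv(["--server", "db.example.com", "upload", "notes.txt"])
print(settings)
```

    Because the settings come back as an ordinary dict, a GUI or a web front end could build the same dict later and reuse everything behind it.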
    With the client app do you think the system could be done in one script or do I need to break it down into different sections?
    The above statement says it all: Different tasks are going to need their own sections. But those tasks might be small, and just be functions, or might be simulated stateful things, and be objects or be GUI stuff or whatever. I think you're really asking me if you need to separate things into different Python modules already -- and no, you don't need to do that yet. Python is really easy to chop into pieces, so I usually wait until my single file grows to the size that it's more annoying to skip around within the file than to open another one in a tab (in vi or emacs or whatever you use). That size tends to hover somewhere between 300 and 1000 lines. Functions are about the same, but within the file. Once a function either contains a lot of nested conditional logic (including "try...except" stuff -- yuk) or just gets to be a half screen or longer, it usually gets refactored into smaller functions. Classes can grow to be super, ungodly behemoths if you're not careful. That's usually a sign that your methods are doing way too much that could be generally useful and should instead be wrappers to external function calls.

    And that's all I'm going to say about program structure for now. For the most part you'll just start feeling yourself having to remember too much at once (too much state), or having to consider cases too often (too many conditionals), or having more than one obvious correct exit from a function (proliferation of "return" statements), or whatever and get annoyed working on whatever you're working on. That's the time you should factor some of that crap out into its own function. Once you notice an obvious rift in the sort of things your functions handle then it's time to separate them into modules. And so on. Just let Frankenstein evolve on his own for now.

    Your biggest job right now is getting the minutiae of db I/O down and figuring out what sort of schema suits your purpose.
    Now that I have scripted some basic queries how should I proceed to making this more 'interesting'....
    The last sentence I wrote is it. Decide what your schema should look like in the db. This is nearly always the most critical part of designing the system, actually, and we haven't talked about it much. You need to write an SQL schema that makes sense for your task. How are you going to store the user data? What does "user data" mean? How about file data? That's always going to mean some metadata plus the actual contents of the file. For now, just keep it dead simple. Show me a sample schema and I'll give you my thoughts on it.

    The step after that is where you go back to messing with your Python script -- because it's only then that you know what queries your script needs to make (sort of important...).
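    For a sense of scale, a dead simple starting point might be nothing more than two tables: users, and files carrying a little metadata plus the contents. This is only a sketch under assumptions: every table and column name is invented for illustration, and it runs against Python's built-in sqlite3 (in memory) so it works without a server. On Postgres the contents column would be bytea rather than BLOB, and psycopg2 uses %s placeholders instead of ?.

```python
import sqlite3

# Hypothetical minimal schema: one row per user, one row per stored file.
SCHEMA = """
CREATE TABLE users (
    id        INTEGER PRIMARY KEY,
    username  TEXT NOT NULL UNIQUE
);

CREATE TABLE files (
    id        INTEGER PRIMARY KEY,
    owner_id  INTEGER NOT NULL REFERENCES users(id),
    filename  TEXT NOT NULL,
    uploaded  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    contents  BLOB NOT NULL,
    UNIQUE (owner_id, filename)
);
"""


def store_file(conn, owner_id, filename, data):
    """Insert one file's metadata and raw bytes."""
    conn.execute(
        "INSERT INTO files (owner_id, filename, contents) VALUES (?, ?, ?)",
        (owner_id, filename, data),
    )


def fetch_file(conn, owner_id, filename):
    """Return the stored bytes, or None if the user has no such file."""
    row = conn.execute(
        "SELECT contents FROM files WHERE owner_id = ? AND filename = ?",
        (owner_id, filename),
    ).fetchone()
    return None if row is None else bytes(row[0])


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO users (id, username) VALUES (1, 'joey')")
store_file(conn, 1, "notes.txt", b"hello")
print(fetch_file(conn, 1, "notes.txt"))
```

    The two helper functions are exactly the queries the client script would need first, which is why settling the schema comes before going back to the script.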

    If any of this didn't make sense you'll have to forgive me. My moai mates caught me just after work and insisted I "warm up" prior to tonight's drunken adventure by getting jeezled up a little ahead of time. But it's OK. I am the sort of man who knows it's the room and not me spinning.
    Last edited by zxq9; July 19th, 2013 at 08:02 AM.