#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2013
    Posts
    7
    Rep Power
    0

    Am I looking at the right tools for this project?


    Hi guys,

    I'm fairly new to .NET development and databases. I've been eyeballing them, but never had a chance to get in. I have about a year and a half of college education in programming and am working hard on my programming degree (taking more C++ and assembly this fall).

    An opportunity fell into my hands to program (which is what I want to do with my life), and this is what I have to do:

    I have to design a program with a simple and clean interface that is capable of listening to folder/file changes in specified directories (the directories would be specified by the user in the options menu).

    Every time a new file is created in the directory (we scan documents into pdfs and send them to specific directories), the program has to automatically rename that new file.

    The file name will be determined by the user. As in: the user can choose a specific file name, or if they don't and just check the auto box, the file will be named with the date of the file creation. At the end of each file name will be a sequence number for that specific name or date. So if 10 files were scanned on 08/10/2013, they would be renamed as 08102013_1, 08102013_2, etc.
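
    A minimal sketch of that naming rule, written in Python only for brevity (the same logic ports directly to C#). The function name `build_name` and its parameters are hypothetical, and the MMDDYYYY date format is taken from the example above.

```python
from datetime import date

def build_name(sequence, custom_name=None, created=None):
    """Return '<base>_<sequence>'.

    base is the user-supplied name if given, otherwise the
    creation date formatted as MMDDYYYY (e.g. 08102013).
    """
    if custom_name:
        base = custom_name
    else:
        created = created or date.today()
        base = created.strftime("%m%d%Y")
    return f"{base}_{sequence}"

# Automatic mode: name derived from the file's creation date.
print(build_name(1, created=date(2013, 8, 10)))  # 08102013_1
# Manual mode: name typed in by the user (hypothetical value).
print(build_name(2, custom_name="backlog"))      # backlog_2
```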

    The program has to have a database to store links to those files, as well as specific information from the document itself, such as location, date, name of the originator, etc. So, some sort of pdf parsing will have to occur (I'm assuming).

    A user has to be able to search through the database by... let's say a specific date, and be able to select those files with an option to email them as a batch to the requesting party with an auto-generated invoice based on the number of documents sent.

    I have experience with C++, Java and Python from college and free time. This program will be used on Windows systems, so it's probably a better idea to go with Windows-favoring languages for better compatibility. I've never done a GUI project with C++, and after some research it appears to be a lengthier process than with Java or C#. Java is good, but since I'll be working in a .NET environment, C# would probably be a better choice.

    I am vigorously learning C# right now in my off-time (I've put in over 8 hours in these last 2 days). It seems to be a fantastic language... I'm actually very impressed and sad that I haven't looked at it before.

    So, I figured that using C# in combination with some sort of SQL database would be a good set of tools for this? I would need to create a Windows service application for listening/renaming that starts automatically at Windows start-up. For the front end, I would need to create an application that displays the database with options for sending the files, and lets the user change the service's options (naming and directories).

    Am I looking at the right set of tools? Is there a better way? I figured I'd ask for opinions from a more experienced crowd so that I don't realize half-way through learning new tools and creating this application that I've made a bad choice.
  2. #2
  3. Lord of the Dance
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Oct 2003
    Posts
    3,614
    Rep Power
    1945
    Sounds like C# can be the correct tool for this, especially with its FileSystemWatcher.

    You didn't really explain how the process should work, e.g. how the file is placed in the folder or how/when the file name should be changed by the user.
    But with C# you have the option to develop the application as a service, with a standard GUI, or web-based (ASP.NET).
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2013
    Posts
    7
    Rep Power
    0
    Originally Posted by MrFujin
    You didn't really explain how the process should work, e.g. how the file is placed in the folder or how/when the file name should be changed by the user.
    The files are placed into folders through scanners. We have a lot of hand written documents that we constantly store in electronic format as PDF. Scanners have options of sending images to specific directories when they're scanned. When a new file appears in the folder, the program should rename it.

    I figured each file that is already in the folder will be stored in the database, which the program can reference to check if any new files are added or deleted. If it's renamed, the database can simply delete the old reference and add the new one (unless there's a better way).
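
    A rough sketch of that comparison, in Python for brevity: the file names the database already knows about are compared with what is currently in the folder. The function name `diff_folder` is hypothetical, and both inputs are assumed to be plain lists of file names.

```python
def diff_folder(known, on_disk):
    """Compare file names stored in the database with the folder contents.

    Returns (added, removed): names present on disk but not in the
    database, and names in the database that are no longer on disk.
    """
    known, on_disk = set(known), set(on_disk)
    added = sorted(on_disk - known)
    removed = sorted(known - on_disk)
    return added, removed

added, removed = diff_folder(
    known=["08102013_1.pdf", "08102013_2.pdf"],
    on_disk=["08102013_2.pdf", "scan0007.pdf"],
)
print(added)    # ['scan0007.pdf']
print(removed)  # ['08102013_1.pdf']
```

    Under this view a rename shows up as one removed name plus one added name, which matches the "delete the old reference and add the new one" idea.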

    I want to have two options in the program as far as naming:

    1. User opens up the program, checks the manual naming option and writes in the name they want their files to have. The program then names each file with that name + sequence number.

    2. User opens up the program and selects the automatic option, which will name each file that it detects as date + sequence number for that date.

    One issue I'm predicting is this: When a user manually moves a certain file that's already named the way user wants it to be into a target directory, the program will detect a new file entering and rename it. So I have to figure out how to prevent that.
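
    One possible way to guard against that, sketched in Python for brevity: skip any incoming file whose name already follows the name + sequence pattern. The helper `is_already_named` and the `known_bases` parameter are hypothetical, and the sketch assumes the scanners' default file names never look like an 8-digit date followed by `_<number>`.

```python
import re
from pathlib import PurePath

# '<base>_<digits>', e.g. 08102013_3 or backlog_12
NAMED = re.compile(r"^(?P<base>.+)_(?P<seq>\d+)$")

def is_already_named(filename, known_bases=()):
    """True if the file already follows the naming convention.

    A file counts as named when its stem is '<base>_<sequence>' and the
    base is either an 8-digit date or one of the manual names in use.
    """
    stem = PurePath(filename).stem
    match = NAMED.match(stem)
    if not match:
        return False
    base = match.group("base")
    return bool(re.fullmatch(r"\d{8}", base)) or base in known_bases

print(is_already_named("08102013_3.pdf"))                        # True
print(is_already_named("scan0001.pdf"))                          # False
print(is_already_named("backlog_7.pdf", known_bases={"backlog"}))  # True
```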


    My biggest worry is the database. I have no experience besides basic Access. I'm assuming SQL is the best way to go (maybe MySQL because it's open source/free)? It'll be the bulk of my program and I have to learn it fast....
  6. #4
  7. Lord of the Dance
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Oct 2003
    Posts
    3,614
    Rep Power
    1945
    When you do the scan, the file should be assigned a standard name.
    Depending on the scanner, it might be able to store the files directly with the format "<date>_<sequence>".

    From your description, it sounds like the user has to "verify" the name of every scanned document, whether it should keep the automatic/default name or be changed to something else.
    In this case, I would suggest moving the files into another storage location, if possible, to separate what has been handled/"verified" by the user.
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2013
    Posts
    7
    Rep Power
    0
    Originally Posted by MrFujin
    When you do the scan, the file should be assigned a standard name.
    Depending on the scanner, it might be able to store the files directly with the format "<date>_<sequence>".

    From your description, it sounds like the user has to "verify" the name of every scanned document, whether it should keep the automatic/default name or be changed to something else.
    In this case, I would suggest moving the files into another storage location, if possible, to separate what has been handled/"verified" by the user.
    The scanners we have are limited on that capability and it would be nice to have more of a static control without worrying what sort of scanners we purchase with our increasingly limiting budget. We also add files manually into the folder once in a while without any specific name and that's when having the manual option is nice to have. We have a tiny Epson scanner on one of the desks and it comes with Epson Event Manager. That thing is great. It is very much like one of the parts of the program that I'm trying to create, unfortunately a lot of its functionality only works with an Epson scanner.

    Basically, here's why both the auto and manual options have to be present. We're back-logged like it's no one's business, because the transition to electronic format didn't start until a few years ago. So, our staff has to constantly scan the new files, which would use the automatic naming, and then they would bring in a batch of old files, which they would manually rename to the date of those files.

    Verification of files by the user is something I want to limit. This is why this program is being created, so that only once in a while something has to be messed with. If the issue arises where the program automatically renames files, then I can work in a specific procedure where a user renames a file as "000" or something along those lines before they insert it into the directory. When the program sees 000, it knows that it's a manual file insertion that overrides automatic renaming. After the file is inserted, the user can rename it, and at that point the program sees a change, rather than a creation, and therefore does not rename the file... but simply updates the database.

    We're trying to save on paper. We end up going through about 10,000 documents a year and things can get expensive as far as paper and ink goes. Having a database with an ability to query for specific documents and send them out to clients will save us a lot of man hours and money. Unfortunately, we don't have the money to pay someone to build it, so here's where I come in. One step at a time, I'll make it happen, but getting started is the most important part in my book.
  10. #6
  11. Lord of the Dance
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Oct 2003
    Posts
    3,614
    Rep Power
    1945
    So the user will scan the files into a folder at their clients and then move it into the folder being watched?

    I think it will be easier to have two folders watched. One that should rename the files automatically and one for manual name.
    For the manual folder, you could have a text (.txt) file where the user can write the manual name before copying the files into it. Then your program will read that file when it finds a new file and rename accordingly.

    The current sequence number for the name can be stored in database.

    When the files have been renamed, you should move the files from those two folders into a third storage folder. You should try to find a structure where you don't end up with 10,000 files in one single folder.

    For a start, get Visual Studio and take a closer look at FileSystemWatcher, and then work on creating a Windows service that can do the automatic (re-)naming.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2013
    Posts
    7
    Rep Power
    0
    Originally Posted by MrFujin
    So the user will scan the files into a folder at their clients and then move it into the folder being watched?

    I think it will be easier to have two folders watched. One that should rename the files automatically and one for manual name.

    No, the files will be scanned into the watched directories. There are just two types of files (current ones and old ones). The old ones would have to be named by manual settings. So the user would select a manual check box, put in the name and begin scanning files. All files that are being scanned will use that naming convention with the sequence number. If new files are being scanned, they don't need to be named with old dates; they can use the current date format, so the automatic check box would be selected and scanning would begin.

    Originally Posted by MrFujin
    For the manual folder, you could have a text (.txt) file where the user can write the manual name before copying the files into it. Then your program will read that file when it finds a new file and rename accordingly.

    The current sequence number for the name can be stored in database.
    I would prefer that they be able to type the name in a text field within the program's options/settings screen. The sequencing would be based on the last sequence number used for the same name. So, if there's a file in the database that is named 08102013_42, the program would know to name the next file going in as ~_43.
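
    A small sketch of that lookup, using Python's built-in sqlite3 module purely for illustration. The `documents` table and its `base`, `seq` and `path` columns are hypothetical; whichever SQL database ends up being used, the query would look much the same.

```python
import sqlite3

def next_sequence(conn, base):
    """Return the next free sequence number for a given name or date."""
    row = conn.execute(
        # MAX() is NULL when no rows match, so COALESCE starts the count at 1.
        "SELECT COALESCE(MAX(seq), 0) + 1 FROM documents WHERE base = ?",
        (base,),
    ).fetchone()
    return row[0]

# In-memory database just for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (base TEXT, seq INTEGER, path TEXT)")
conn.execute("INSERT INTO documents VALUES ('08102013', 42, '08102013_42.pdf')")

print(next_sequence(conn, "08102013"))  # 43
print(next_sequence(conn, "08112013"))  # 1
```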

    Originally Posted by MrFujin
    When the files have been renamed, you should move the files from those two folders into a third storage folder. You should try to find a structure where you don't end up with 10,000 files in one single folder.

    For a start, get Visual Studio and take a closer look at FileSystemWatcher, and then work on creating a Windows service that can do the automatic (re-)naming.
    As far as folders, that is true... Having a folder with thousands of files can be weird (but that's what we have right now). I'll probably have to create folders based on dates. So, maybe a good way to tackle the naming and folder issues at the same time would be to create a folder named after the file names' common prefix. So, any 08102013_# files would be stored in the 08102013 folder. One problem, though, is that all those directories would have to be monitored. So, I would need to figure out how to monitor all the sub-folders for changes in case a file is moved there manually, so that it can be added to the database and renamed if needed.
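
    The folder-per-name idea could be sketched like this, again in Python for brevity. The function name `storage_path` and the `archive` root are hypothetical; the folder is simply whatever precedes the final `_<sequence>` in the file name.

```python
from pathlib import PurePath

def storage_path(root, renamed_file):
    """Place '<base>_<seq>.pdf' under '<root>/<base>/'.

    Files without a '_<seq>' suffix have an empty base and stay in root.
    """
    p = PurePath(renamed_file)
    base, _, _seq = p.stem.rpartition("_")
    return PurePath(root) / base / p.name

print(storage_path("archive", "08102013_3.pdf").as_posix())
# archive/08102013/08102013_3.pdf
```

    For the monitoring side, the .NET FileSystemWatcher class has an IncludeSubdirectories property, so a single watcher on the root folder may be enough.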


    I will definitely take an extensive look at FileSystemWatcher. I'm watching a ton of videos right now to get the basics of C# out of the way. Found an awesome 30-day course on YouTube; I went through 20 videos in 2 days. Luckily I have a good foundation because of my knowledge of C++ and Java.
