1B - Create an input file

Postby Bob » Mon Sep 17, 2007 6:41 pm

One of the most important skills from my programming past has been the ability to properly test an application. Given that the majority of the applications any of us are going to build will be fairly straightforward, and will often read from a file of some sort, I pose this problem.

You are going to be given a file that needs parsing. It has entries in the following format:
Code: Select all
LName, FName Birthday SSN Join_Date Years_Of_Experience Wage

Because the data is confidential, not even your boss has access to the real file, so you have nothing real to test against. Write an application that creates this file with an arbitrary number of entries, specified on the command line, and random enough that an ordinary person could not tell it is artificial. The names and other values should therefore be randomly generated and unsorted.

A sample entry would be as follows:
Code: Select all
Chatman, Robert 05/10/1983 123-45-6789 03/01/2006 4 35.22
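
When testing a generator against this layout, it can help to check each generated line mechanically. The following is only a sketch in Ruby: the helper name valid_entry? and the exact character classes are my own assumptions (for example, names limited to letters and apostrophes), not part of the challenge.

```ruby
# Hypothetical helper: checks one line against the entry layout
#   LName, FName Birthday SSN Join_Date Years_Of_Experience Wage
# The character classes below are assumptions about what counts as valid.
def valid_entry?(line)
  pattern = %r{
    \A
    [A-Za-z']+,\s          # LName followed by a comma and a space
    [A-Za-z]+\s+           # FName
    \d{2}/\d{2}/\d{4}\s+   # Birthday as MM/DD/YYYY
    \d{3}-\d{2}-\d{4}\s+   # SSN as NNN-NN-NNNN
    \d{2}/\d{2}/\d{4}\s+   # Join_Date as MM/DD/YYYY
    \d+\s+                 # Years_Of_Experience
    \d+\.\d{2}             # Wage with two decimal places
    \z
  }x
  pattern.match?(line.chomp)
end
```

The fields are separated with \s+ rather than a single space so that a generator which pads a column to a fixed width still passes.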

The application should be able to generate this file with something similar to the following command line execution (granted, the file name will be different and it may not execute on its own):
Code: Select all
./makeinput myinputfile.gne 600000

myinputfile.gne is the filename that we want our input file to have
600000 is the number of entries in this file.

I suggest that anyone doing this choose Perl, and try to implement it in at least 2 other languages. My solutions will be up here at the end of the week.
Bob
Site Admin
 
Posts: 252
Joined: Mon Nov 20, 2006 12:24 am
Location: San Jose California

Re: 1B - Create an input file

Postby Bob » Mon Sep 17, 2007 7:03 pm

As a checklist/suggested route this application should:

  • Confirm valid command line input parameters
    • Given no input, bounce out with the usage echoed.
    • Create a new file if it doesn't exist
      • If the file does exist, confirm that we want to overwrite it, and drop out if we don't (OOPS!)
  • Make the required entries in the file, all newline separated.
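
The validation and overwrite steps in this checklist could be sketched in Ruby roughly as follows. The helper names parse_args and ok_to_write? are hypothetical, and the input/output parameters exist only so the prompt can be exercised without a real terminal. Treat it as one possible shape under those assumptions, not the reference solution.

```ruby
# Hypothetical helpers sketching the checklist; names are illustrative only.

# Returns [path, count] for valid arguments, or nil when the usage text
# should be printed instead.
def parse_args(argv)
  path, count = argv
  return nil if path.nil? || path.empty?
  return nil unless /\A\d+\z/.match?(count.to_s)
  [path, count.to_i]
end

# True when it is safe to write: the file does not exist yet, or the
# user explicitly typed "yes" at the prompt (any case).
def ok_to_write?(path, input = $stdin, output = $stdout)
  return true unless File.exist?(path)
  output.print "The file [#{path}] already exists. Overwrite it? [YES] "
  answer = input.gets
  !answer.nil? && answer.strip.casecmp("yes").zero?
end
```

A script would call parse_args(ARGV) first, print the usage and exit when it returns nil, and only open the file for writing once ok_to_write? returns true.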

Re: 1B - Create an input file - Perl

Postby Bob » Wed Sep 19, 2007 2:46 pm

Command Line:
Code: Select all
C:\Projects\Perl\Gneu Challenges\1B - Input File>alpha.pl input.file 200000
input.file
C:\Projects\Perl\Gneu Challenges\1B - Input File>alpha.pl input.file 200000
input.filethe file [input.file] exists already. Would you like to overwrite the other file? [YES] no

C:\Projects\Perl\Gneu Challenges\1B - Input File>alpha.pl input.file 2
input.filethe file [input.file] exists already. Would you like to overwrite the other file? [YES] yes

C:\Projects\Perl\Gneu Challenges\1B - Input File>alpha.pl input.file 200000
input.filethe file [input.file] exists already. Would you like to overwrite the other file? [YES] no

C:\Projects\Perl\Gneu Challenges\1B - Input File>alpha.pl input.file 200000ee
input.fileUsage : C:\Projects\Perl\Gneu Challenges\1B - Input File\alpha.pl path/to/inputFile entryCount
        path/to/inputFile       A valid file name or path to one
        entryCount              integer number of entries in the file.
C:\Projects\Perl\Gneu Challenges\1B - Input File>alpha.pl input.file 200000
input.filethe file [input.file] exists already. Would you like to overwrite the other file? [YES] yes

C:\Projects\Perl\Gneu Challenges\1B - Input File>


Application:
Code: Select all
#!/usr/bin/perl -w
#
# Create an input file.
#
# $0 $inputFile $entryCount
################################################################################

## Standard practice - keep things cross platform and the like
use strict;
use warnings;

## Grab the input variables out of the command line list
my ($inputFile, $entryCount) = @ARGV;
my ($input) = ();
my (@Fnames, @Lnames) = ();

## Do command line validation
## Confirm  both arguments were supplied
##          entryCount is an integer
##          inputFile is a valid filename (letters, digits, . - _ and path separators)
unless ( defined $inputFile && defined $entryCount
      && $entryCount =~ /^\d+$/ && $inputFile !~ m#[^A-Za-z0-9.\-_/\\]# )
{
    print "Usage : $0 path/to/inputFile entryCount\n",
          "\tpath/to/inputFile\tA valid file name or path to one\n",
          "\tentryCount       \tinteger number of entries in the file.\n";
    exit 1;   # non-zero status signals the usage error
}

## the inputFile path may contain backslashes (\) which do not work on Unix,
## while forward slashes (/) work on both os's. Swap them out
$inputFile =~ s#\\#/#g;

## Handle edge case of when file exists...
## Check to see if the file exists and perform accordingly
if (-e $inputFile)
{
    print "the file [$inputFile] exists already. Would you like to overwrite the other file? [YES] ";
    $input = <STDIN>;

    ## Anything other than an explicit "yes" (including end of input) aborts
    exit if (!defined $input || $input !~ /^\s*YES\s*$/i);
}

## Load Appropriate Arrays

@Fnames = qw(Bob Michael Mike Steve Andre Tony Jeremy Anthony Oleg Olga Tonya Stephenie Matt Andrew Barbara Blake Brian Bruce Chris Sean Joel);
@Lnames = qw(Stevens Chatman Spolsky Young O'Brien Smith Grant Aaron Bonds Chiu Chong James Lee McBride Hedges Hitchens Dawkins Gray Thomas Garant Hosslinger);

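## Open file for writing (three-argument open with a lexical filehandle)
open(my $outFile, '>', $inputFile) or die "Couldn't open file [$inputFile] - $!";

for (1..$entryCount)
{
    # LName, FName Birthday SSN Join_Date Years_Of_Experience Wage
    printf $outFile "%s, %s %02d/%02d/%4d %03d-%02d-%04d %02d/%02d/%4d %2d %.2f\n",
        $Lnames[int(rand(@Lnames))],
        $Fnames[int(rand(@Fnames))],
        # Birthday: month 1-12 (rand(11) would never yield December), day 1-28, year 1967-1986
        int(rand(12)) + 1, int(rand(28)) + 1, int(rand(20)) + 1967,
        # SSN: first group 100-299, second group 00-98, third group 0500-9999
        int(rand(200)) + 100, int(rand(99)), int(rand(9500)) + 500,
        # Join date: month 1-12, day 1-28, year 1994-2006
        int(rand(12)) + 1, int(rand(28)) + 1, int(rand(13)) + 1994,
        # Years of experience 0-10, wage from 0.00 up to 45.00
        int(rand(11)), rand(45);
}
## Close file, exit
close $outFile or die "Couldn't close file [$inputFile] - $!";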


The file created by running this with 2000 entries is attached.
Attachments
input.txt
Contains 2000 entries generated by the input file program. Use it for reference to see how you could set yours up.
(112.96 KiB) Downloaded 501 times

Re: 1B - Create an input file - Ruby

Postby Bob » Thu Sep 20, 2007 7:03 pm

Command Line:
Code: Select all
C:\Projects\Ruby\Gneu Challenges\1B - Input File>alpha.rb close.txt 222
the file [close.txt] exists already. Would you like to overwrite the other file? [YES] yes

C:\Projects\Ruby\Gneu Challenges\1B - Input File>alpha.rb close.txt 223000
the file [close.txt] exists already. Would you like to overwrite the other file? [YES] yes

C:\Projects\Ruby\Gneu Challenges\1B - Input File>dir
 Volume in drive C has no label.
 Volume Serial Number is E4D7-A071

 Directory of C:\Projects\Ruby\Gneu Challenges\1B - Input File

09/20/2007  06:37 PM    <DIR>          .
09/20/2007  06:37 PM    <DIR>          ..
09/20/2007  06:55 PM             2,158 alpha.rb
09/20/2007  07:02 PM        12,894,626 close.txt
               2 File(s)     12,896,784 bytes
               2 Dir(s)   4,276,465,664 bytes free

C:\Projects\Ruby\Gneu Challenges\1B - Input File>


Application
Code: Select all
#!/usr/bin/ruby
################################################################################

## Do command line validation
## Confirm  entryCount is an integer
##          inputFile is a valid filename (letters, digits, . - _ and path separators)
if ( ARGV[0] !~ /\A[A-Za-z0-9\\\/.\-_]+\z/ || ARGV[1] !~ /\A\d+\z/ )
    print "Usage : #{$0} path/to/inputFile entryCount\n"
    print "\tpath/to/inputFile\tA valid file name or path to one\n"
    print "\tentryCount       \tinteger number of entries in the file.\n"
    exit 1   # non-zero status signals the usage error
end

inputFile = ARGV[0]
entryCount = ARGV[1]

## the inputFile path may contain backslashes (\) which do not work on Unix,
## while forward slashes (/) work on both os's. Swap them out
inputFile = inputFile.gsub(/\\/, '/')

## Handle edge case of when file exists...
## Check to see if the file exists and perform accordingly
if (File.exist?(inputFile))
    print "the file [" << inputFile << "] exists already. Would you like to overwrite the other file? [YES] "
    input = STDIN.gets

    exit if (input !~ /^\s*YES\s*$/i)
end

## Load Appropriate Arrays
FNames = ['Bob', 'Michael', 'Mike', 'Steve', 'Andre', 'Tony', 'Jeremy', 'Anthony', 'Oleg', 'Olga', 'Tonya', 'Stephenie', 'Matt', 'Andrew', 'Barbara', 'Blake', 'Brian', 'Bruce', 'Chris', 'Sean', 'Joel']
LNames = ['Stevens', 'Chatman', 'Spolsky', 'Young', "O'Brien", 'Smith', 'Grant', 'Aaron', 'Bonds', 'Chiu', 'Chong', 'James', 'Lee', 'McBride', 'Hedges', 'Hitchens', 'Dawkins', 'Gray', 'Thomas', 'Garant', 'Hosslinger']

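## Open file for writing
## (the block parameter must be a local variable; a constant like OutFile is a syntax error)
File.open(inputFile, "w") do |out_file|
    # to_i reads the digits as decimal; Integer("010") would treat them as octal
    entryCount.to_i.times do

        # LName, FName Birthday SSN Join_Date Years_Of_Experience Wage
        out_file.print "%s, %s %02d/%02d/%4d %03d-%02d-%04d %02d/%02d/%4d %2d %.2f\n" % [
            LNames[rand(LNames.length)], FNames[rand(FNames.length)],
            # Birthday: month 1-12 (rand(11) would never yield December), day 1-28, year 1967-1986
            rand(12) + 1, rand(28) + 1, rand(20) + 1967,
            # SSN: first group 100-299, second group 00-98, third group 0500-9999
            rand(200) + 100, rand(99), rand(9500) + 500,
            # Join date: month 1-12, day 1-28, year 1994-2006
            rand(12) + 1, rand(28) + 1, rand(13) + 1994,
            # Years of experience 0-10; rand(45) returns an Integer in Ruby,
            # so scale a float instead to get wages with cents
            rand(11), rand * 45
        ]
    end
end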


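The file it produces is comparable to the one produced by the Perl application, so I won't bother attaching the output, although here is a sample. Every wage in it ends in .00 because this run used Ruby's rand(45), which returns an integer, whereas Perl's rand(45) returns a fractional number: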

Code: Select all
Hosslinger, Anthony 05/09/1978 240-95-9687 07/06/2002  7 11.00
James, Chris 01/11/1968 273-19-6385 03/13/2001  2 10.00
Lee, Bruce 08/26/1980 129-04-4164 10/06/2006  6 40.00
Bonds, Stephenie 05/26/1984 167-18-2557 06/10/2005  6 24.00
Grant, Bob 08/19/1980 174-55-4990 07/09/2002  0 1.00
Hosslinger, Brian 02/07/1981 278-40-5153 02/05/2004  0 17.00
Hosslinger, Stephenie 03/02/1979 202-52-2087 09/11/1995  2 12.00
Garant, Steve 03/20/1982 169-32-3012 08/16/2002  0 27.00
O'Brien, Chris 02/02/1978 290-87-6894 09/15/1999  5 35.00
Hitchens, Blake 09/26/1973 253-77-0746 11/26/1998  3 13.00
Aaron, Andrew 03/06/1977 222-34-3197 07/24/1994  2 10.00
Thomas, Chris 07/25/1979 223-67-5885 05/20/2003  7 15.00
Chatman, Joel 09/02/1977 146-67-2721 07/23/2000  2 33.00
James, Oleg 02/18/1983 202-75-9818 01/13/1995  3 44.00
Spolsky, Stephenie 10/15/1982 162-30-6704 06/17/2004  4 20.00
Hedges, Tonya 10/12/1968 290-67-4334 11/12/2005  6 44.00
Chatman, Blake 07/21/1967 168-11-7592 06/15/2003 10 16.00
Gray, Blake 06/17/1986 115-48-3740 07/08/1997  5 43.00
Young, Jeremy 03/01/1967 298-06-4920 11/11/2005 10 9.00
Garant, Mike 04/17/1985 298-49-2277 10/17/1996  5 43.00
Dawkins, Barbara 04/15/1981 295-25-5961 03/11/2006  4 15.00
Spolsky, Oleg 01/04/1969 269-56-1370 03/06/2005  5 4.00
O'Brien, Bob 02/01/1973 268-04-3697 01/27/2005  9 3.00
Garant, Stephenie 07/03/1982 265-81-2855 07/14/2005 10 22.00
Young, Jeremy 07/25/1977 180-30-5225 01/20/2001  3 13.00
Smith, Jeremy 08/25/1978 237-64-0810 03/26/2006  3 20.00
Dawkins, Chris 02/14/1967 236-97-9606 05/05/2006  5 28.00
O'Brien, Michael 04/05/1978 119-55-6255 09/06/1998  0 2.00
James, Andrew 10/15/1980 125-59-7131 01/10/2001  1 30.00
Chatman, Olga 11/15/1978 260-63-3382 11/22/1998  7 21.00
Hedges, Steve 11/04/1969 160-01-0805 04/24/2000 10 19.00
Stevens, Tony 09/07/1969 287-50-3328 05/13/1994  1 36.00
O'Brien, Blake 07/22/1980 102-57-0503 10/28/2000  4 30.00
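
Both versions above cap the day of the month at 28 so that every month/day pair is a real date, which means the 29th through 31st never appear. One alternative, sketched here in Ruby with the standard date library, is to pick a random day offset between two real dates. The helper name random_date and the example year range are assumptions for illustration only.

```ruby
require 'date'

# Hypothetical helper: picks a uniformly random calendar date between
# first and last (both inclusive), so days 29-31 can appear where valid.
def random_date(first, last, rng = Random.new)
  span = (last - first).to_i     # whole days between the two dates
  first + rng.rand(0..span)      # Date + Integer advances by that many days
end

# Example: a birthday in the same 1967-1986 range the generators use.
birthday = random_date(Date.new(1967, 1, 1), Date.new(1986, 12, 31))
puts birthday.strftime('%m/%d/%Y')
```

The same helper could produce the join date by passing a range that starts after the birthday, which would also keep the two dates consistent with each other.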