Results 1 to 5 of 5

Thread: Perl Script to help feed DBMS MySQL

  1. #1
    Join Date
    Aug 2009
    Posts
    56

    Perl Script to help feed DBMS MySQL

    I started in Perl and I am reading the doc to learn things. My problem is simple, I have multiple files that I just need to manipulate, format to feed my MySQL database. I use MySQL 5.1, Toad Perl v4.1 & v5.10

    This is one example among many other CSV file:
    Code:
    245362;77;77289;INR69907;Felt;799;20091102232406;;VEL;799;66;24;
    243640;77;66017;INT81553;"Notebook";699;20091102232318;;VEL;699;66;24;
    236958;77;08;DRT03286;bed;299;20091102231945;;VEL;299;66;15;
    319771;77;79060;INA38898;Pen;399;20091102231902;;VEL;399;66;3;
    319771;77;79060;BTQ38898;Pen;399;20091102231902;;VEL;399;66;24;
    319250;77;13110;INJ40768;Mattresses;599;20091102231736;;VEL;599;66;24;
    320343;77;21758;PRP31628;Mice;299;20091102231459;;VEL;299;66;24;
    331379;77;27209;INT59377;Kit;299;20091102231319;;VEL;299;66;15;
    314147;77;89301;BTQ12922;Scissors;0;20091102231248;;VEL;0;66;24;
    334467;77;26234;TEL38873;Handkerchief;499;20091102231131;;VEL;499;66;24;
    241749;77;37732;TEL90191;Eraser;699;20091102231104;;VEL;699;66;24;
    The desired result:
    Code:
    insert into vod_billing values("236958","8","DRT03286","bed","299","20091102231945","15","DRT" );
    insert into vod_billing values("241749","37732","TEL90191","Eraser","699","20091102231104","24","TEL" );
    insert into vod_billing values("243640","66017","INT81553","Notebook","699","20091102232318","24","INT" );
    insert into vod_billing values("245362","77289","INR69907","Felt","799","20091102232406","24","INT" );
    insert into vod_billing values("314147","89301","BTQ12922","Scissors","0","20091102231248","24","BTQ" );
    insert into vod_billing values("319250","13110","INJ40768","Mattresses","599","20091102231736","24","INT" );
    insert into vod_billing values("319771","79060","BTQ38898","Pen","399","20091102231902","24","BTQ" );
    insert into vod_billing values("319771","79060","INA38898","Pen","399","20091102231902","3","INT" );
    insert into vod_billing values("320343","21758","PRP31628","Mice","299","20091102231459","24","PRP" );
    insert into vod_billing values("331379","27209","INT59377","Kit","299","20091102231319","15","INT" );
    insert into vod_billing values("334467","26234","TEL38873","Handkerchief","499","20091102231131","24","TEL" );
    it gives this:
    1.concatenate all *.csv* a directory into a single file
    2.replace all characters "by anything (double quote)
    3.Remove all duplicate lines
    4.Sortby the first column
    5. Remove all empty lines
    6. The character is the delimiter column then delete column 2,8,9,10 & 11
    7. To add last column in the first 3 characters of the column 3 (TEL, PRI etc ...)
    8. On this last column, replace all INA INJ & Rs per INT
    9. Insert before the line vod_billing insert into values ( "
    10. To include end of line ");
    11. To replace the delimiter, for ""
    12. Rename the output file .sql

    The ultimate goal would be to directly feed my MySQL database created with lines, but already such a script would be perfect.

    I know it's a bit abused but probably simpler for a developer!

  2. #2
    Join Date
    May 2008
    Posts
    2,297

    Re: Perl Script to help feed DBMS MySQL

    There's a difference between helping someone who starts, and her spinning is all scripts. For referrals, I suggest you to go and read a little help on the functions open(), split(), close(). A quick foreach () will also help you.

  3. #3
    Join Date
    Nov 2008
    Posts
    1,022

    Re: Perl Script to help feed DBMS MySQL

    It's all very easy in perl, but what is (an order of magnitude) the size (number of lines) of a file .csv obtained by concatenation of those listed in a directory, in order to see the strategy most appropriate (while memory or not) to remove duplicates and sorting.

  4. #4
    Join Date
    Nov 2008
    Posts
    1,221

    Re: Perl Script to help feed DBMS MySQL

    Quote Originally Posted by KALANI84 View Post
    it gives this:
    1.concatenate all *.csv* a directory into a single file
    2.replace all characters "by anything (double quote)
    3.Remove all duplicate lines
    4.Sortby the first column
    5. Remove all empty lines
    6. The character is the delimiter column then delete column 2,8,9,10 & 11
    The correct order to do this is:
    1. Concatenate all *. csv * a directory into a single file
    2. Remove all empty lines
    3. The character is the delimiter column then delete column 2,8,9,10 & 11
    4. Sort by the first column
    5. Delete all duplicate lines
    6. Replace all characters "by anything (double quote)

    Because otherwise we may have lines that are not duplicates before deleting columns which become then (I do not know the exact structure of data lines, so it implies) [or when the order initially it is important and can be duplicates after deleting columns?]

    Delete duplicates after sorting is optimal: the duplicates will then successive, which makes things easier.

  5. #5
    Join Date
    Aug 2009
    Posts
    56

    Re: Perl Script to help feed DBMS MySQL

    As stated I started doing the script Via Agenda quote (I added in step 7 & 8)

    Code:
    #!/usr/bin/perl -w
    use strict;
    use warnings;
    # Global Variables
    my $LOCAL_DIR = "/cygdrive/d/Nicolas/VOD/_.billing/_.compiler/backup/test/";
    my $DEST_DIR = "/cygdrive/d/Nicolas/VOD/_.billing/_.compiler/backup/";
    my $SEARCH_FILE = "*_billing_tvod_*";
    my $OUTPUT_FILE = $DEST_DIR . sprintf("RESULT_FILE.TXT" );
    # Script Perl
    print "Starting...\n";
    print "\n";
    open (OUT,">$OUTPUT_FILE" ) or die "Message : $!\n";
    my @RESULT_FILE = <$LOCAL_DIR/$SEARCH_FILE>;
    foreach my $LINE(@RESULT_FILE)
    {
      open IN,"$LINE" or warn "Message : $!\n";
      while(<IN> )
      {
          s/\"//g;
          next if m/^$/;
          my @SPLIT_TAB = split(/;+/);
          push(@SPLIT_TAB,substr($SPLIT_TAB[3],0,3));
          chomp(@SPLIT_TAB);
          if($SPLIT_TAB[11] eq 'INA')
          {
              my $ERASE_COLUMN = pop(@SPLIT_TAB);
              push(@SPLIT_TAB,'INR');
              chomp(@SPLIT_TAB);
          }
          elsif($SPLIT_TAB[11] eq 'INS')
          {
              my $ERASE_COLUMN = pop(@SPLIT_TAB);
              push(@SPLIT_TAB,'INR');
              chomp(@SPLIT_TAB);
          }
          elsif($SPLIT_TAB[11] eq 'INJ')
          {
              my $ERASE_COLUMN = pop(@SPLIT_TAB);
              push(@SPLIT_TAB,'INR');
              chomp(@SPLIT_TAB);
          }
          print OUT "$SPLIT_TAB[0] $SPLIT_TAB[2] $SPLIT_TAB[3] $SPLIT_TAB[4] $SPLIT_TAB[5] $SPLIT_TAB[6] $SPLIT_TAB[10] $SPLIT_TAB[11]\n";
          #print OUT $_;
      }
    }
    print "\n";
    print "Ending...\n";
    END
    I managed tasks 1, 2, 6, 7 & 8 and probably 3.

    For task 3 deletion is not done because I'm selecting some columns, but I think it is not optimized.

    For task 4 I can not include it in the foreach so it'll have that I treat this in a second file from the 45/46 line is that it?

    Stupid question, how I leave my RESULT_FILE.TXT file I created? So to put it in a new table for the lot?

    For task 5 I found a function with an array of hash on the result but one of my columns can be seen truncated. I do not know if it's the function or fate previous:
    sub doublons_grep {
    my %hash_sans_doublon;

    return grep {! $hash_sans_doublon{$_}++ }@_;
    }

    To answer questions of CitricAcid:
    The order of magnitude is about 30000 lines.
    There is no restriction to the elimination of duplication before or after sorting, no matter.

Similar Threads

  1. How to call perl script from Python?
    By Jhonwho in forum Software Development
    Replies: 3
    Last Post: 23-07-2009, 09:59 PM
  2. Is it possible to execute Perl script within another script?
    By RasMus in forum Software Development
    Replies: 2
    Last Post: 21-07-2009, 10:57 PM
  3. Need Perl script to execute from another.
    By beelow in forum Software Development
    Replies: 2
    Last Post: 20-06-2009, 09:17 AM
  4. Title of a Perl CGI script
    By !const in forum Software Development
    Replies: 2
    Last Post: 26-03-2009, 11:43 PM
  5. [DBMS] PostegreSQL and MySQL: whatever you want them?
    By Neil'o in forum Software Development
    Replies: 3
    Last Post: 18-03-2009, 12:29 PM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Page generated in 1,717,027,668.07777 seconds with 17 queries