 From: Pete Sass
 Marathon, Ontario
 Canada
 To: Ali Koumaiha
 Farmington Hills
 Michigan - Lebanon
Subject: RE: NEED FOR SPEED STRING PARSING
Thread ID: 416207 Message ID: 416212 # Views: 56 # Ratings: 0
Version: Visual FoxPro 9 SP2 Category: General VFP Topics
Date: Tuesday, December 23, 2014 9:18:02 PM         
   


> > Hi Folks,
> >
> > I posted a thread a couple of days ago about conversion of a Unix file that
> > is in an old Unix system 25 - 30 years old and the original developer is
> > no longer with us. Client knows nothing about the file format and no
> > documentation exists.
> >
> > I have proven that the data files in this system were home-spun, follow
> > no standard header, and nothing can read them. Unix vi and bvi will not
> > read them . . . however, I can hexdump the data files and see, you know,
> > everything. The text is in the data files, but, as one knows, so are all the high-bit
> > characters.
> >
> > There are no Windows CR or LF characters per se, so these have to be added into the data,
> > and I have determined that one record is a 216-character string.
> >
> > Now to my current issue . . .
> > I had to write a custom stripper program to walk through the file 1 character at
> > a time and keep it if it is an ASCII character (a number or letter) and remove it
> > if binary.
> >
> > OK, this was not hard to do, but now comes my question. I see no other way to
> > extract the real data, and the data files are 153 Meg, 180 Meg
> > and some over 1 GB in size .... Yikes. For the smaller file of 153 Meg,
> > testing at the character level means 153 million checks and decisions to keep
> > the character or delete it.
> > On a 150 Meg binary data file the stripper worked for 16 hours and was 12 % done.
> > So am I looking at 1 week 24X7 to accomplish just Phase I of the stripping?
> >
> > Looking at the 1 GIG plus files, am I looking at ?? 6 weeks ??
> >
> > I am running on a Dell OptiPlex with i7 3rd Generation processor with no other load
> > and 16 GIG of memory.
> >
> > ANY IDEAS AT ALL ABOUT THIS FOLKS??
> >
> > The core of the stripper is a For loop where lnBitLength = 153,000,000 characters in total to test.
> > I turned off the Message and starred this out and this made little if any difference in the speed.
> >
> >
> >
> > FOR lnX = 1 TO lnBitLength
> > 
> >     lnPC = lnx / lnBitLength * 100
> >     
> >     * --- Hard carriage return counter.
> >     lnCRcounter = lnCRcounter + 1
> > 
> >     lcCurChar  = SUBSTR(lcDataString, lnX, 1)
> >     lcCurChar2 = SUBSTR(lcDataString, lnX - 1, 1)
> >     * --- This is going to perform a step-by-step full rewrite
> >     * --- of the entire string one character at a time.
> >     IF BETWEEN(ASC(lcCurChar), lnAscLow, lnAscHigh) && Is upper/lower alphas or numeric. 
> >           lcGoodInfo = lcGoodInfo + lcCurChar 
> >       
> >         * --- Pattern recognizer to only add 1 space in between character information.  
> >         IF ASC(lcCurChar) = 32 .and. ASC(lcCurChar) <> 32
> >               lcGoodInfo = lcGoodInfo + " "
> >         ENDIF
> >          
> >     ENDIF 
> >     
> >     * --- Calculated string length per record at 216 characters long.
> >     * --- The only way to find the below record string length is to 
> >     * --- parse a test run of say 4,000 bits and physically count
> >     * --- the total number of character encompassing one record.
> >     IF lnCRcounter = 216
> >         lcGoodInfo = lcGoodInfo + CHR(13) + CHR(10)
> >         lnCRcounter = 0
> >     ENDIF      
> >     
> >     * --- Increment the % counter.
> >     lnPCcounter = lnPCcounter + 1
> >  
> >     * --- Show % status.
> >     lcPC = ALLTRIM(STR(lnPC,12,4))
> >     SET MESSAGE TO "Percent conversion currently at : " + lcPC
> >     * --- lcGoodInfo = lcGoodInfo + " "
> >  ENDFOR

> >
> >
> >
> > Pete "the IceMan", from the Great White North of Canada.
> > www.marathongriffincomputers.com
>
> Since you have to break each record at the 216 counter (lnCRcounter = 216),
>
> why don't you read the entire record, 216 characters at a time, with substr(x,216), and increment by 216 per row instead of walking the string one character at a time?
> Then do a strtran() to replace the chr(32) with a " ".
>
> no?
>
> Ez Logic
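
For illustration, the record-at-a-time idea suggested above might look roughly like the sketch below. It uses the low-level file functions FOPEN(), FREAD() and FWRITE() so the whole file never has to sit in one string. The file names and variable names are made up for the example; only the 216-byte record length comes from the thread, and the stripping step itself is left as a comment.

```foxpro
* Sketch only. lcInFile, lcOutFile and the variable names below are
* hypothetical; the 216-byte record length is the figure from the thread.
LOCAL lcInFile, lcOutFile, lnIn, lnOut, lcRecord
lcInFile  = "unixdata.bin"      && assumed input file name
lcOutFile = "unixdata.txt"      && assumed output file name

lnIn = FOPEN(lcInFile)          && low-level open, read-only by default
IF lnIn < 0
    = MESSAGEBOX("Cannot open " + lcInFile)
    RETURN
ENDIF

lnOut = FCREATE(lcOutFile)
IF lnOut < 0
    = FCLOSE(lnIn)
    = MESSAGEBOX("Cannot create " + lcOutFile)
    RETURN
ENDIF

DO WHILE NOT FEOF(lnIn)
    * --- One raw 216-byte record per pass (fewer bytes at end of file).
    lcRecord = FREAD(lnIn, 216)

    * --- Strip the unwanted characters from lcRecord here.

    * --- Write the record followed by a Windows line ending.
    = FWRITE(lnOut, lcRecord + CHR(13) + CHR(10))
ENDDO

= FCLOSE(lnIn)
= FCLOSE(lnOut)
```

Whether this is actually faster than the original loop on the real files would have to be measured; it is only meant to show the shape of the approach.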


Hi,

I did not break it up this way, as I felt that stepping through each character would take the
same time or longer regardless of whether I broke the string up into 216-character strings or just
left it as one large string.
However, I am going to try anything at this point, so I will break it up and run in
216-character string lengths inside the . . . loop.
But don't forget it is not just blanks and white space that have to be stripped out. Any character
that is not a keyboard code has to be removed, and this is a binary file with thousands of
#$@!&*^()^$*( characters in it. You know what I mean . . . I call them high-bit characters; others
may call them non-ASCII characters.
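
One possible way to remove every non-keyboard character from a string in a single call, rather than testing one character at a time, is CHRTRAN(). The sketch below is not code from this thread: the function name StripNonPrintable is hypothetical, and the "keep" range of ASCII 32 to 126 is an assumption that should be changed to match whatever lnAscLow and lnAscHigh hold in the original loop.

```foxpro
* Hypothetical helper: returns tcText with every character outside the
* assumed printable range (ASCII 32..126) removed.
FUNCTION StripNonPrintable(tcText)
    LOCAL lcBad, lnCode

    * --- Build the list of characters to delete.
    lcBad = ""
    FOR lnCode = 0 TO 255
        IF NOT BETWEEN(lnCode, 32, 126)
            lcBad = lcBad + CHR(lnCode)
        ENDIF
    ENDFOR

    * --- CHRTRAN() deletes any character in its second argument that has
    * --- no counterpart in the third, so an empty third argument removes
    * --- all of them in one pass.
    RETURN CHRTRAN(tcText, lcBad, "")
ENDFUNC
```

For example, `? StripNonPrintable("AB" + CHR(200) + "C" + CHR(7))` should display `ABC`. In a real run the delete list would be built once outside the record loop instead of on every call.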

Pete "the IceMan", from the Great White North of Canada.
www.marathongriffincomputers.com
