Welcome to Gerald's Silly little AI page. Created 10/14/2001
(Updated to include VIM Runtime and to fix for use with GVIM instead of DOS VIM 10/21/2001)
(Updated (again) to include VIM 6.1 (new version) 7/7/02)
(Updated (once more) to VIM 6.2 6/12/03)
Welcome! Click the links below to get stuff:
VIM 32bit 6.2 (4.24MB) Windows 95/ 98/ ME/ NT/ 2000/ XP
Self installing Executable.
AWK320.zip (16-bit) (92KB)
AWK95.exe (156KB) for Long File Name Support
Sure, I know what you're thinking. This is a slapped-together web page for the sole purpose of attracting people. OK, so I saw that AI-interested people are visiting my home page, and I thought I'd accommodate them.
Here's why you're here:
1) You've been to Greg Leedberg's site (leedberg.com/glsoft) and downloaded DAISY or BILLY.
2) You've then been to the message board and found one of my responses almost interesting.
3) You're interested in dumping a huge file to work with Billy or Daisy.
4) Or, you are a (*ahem*) regular visitor to my site and saw the new link up :-).
5) Or, even stranger, you got dropped here from a search engine! (Click here to go to my home page!)
Let's get to the meat.
Check out my short FAQ
Let's assume you have a text file that you'd like to convert. There are many etexts available in lots of places. Project Gutenberg might be one such source for your needs. After 75 years, literature becomes "public domain," in general, and that is the type of work you will find there.
In fact, let's use Grimm's Fairy Tales (529k) to begin our little test. Never mind the size of the text. We want to convert it to a Daisy file. Oh, and I'm not guaranteeing this method will work for you on any text, including the sample we're using, so play at your own risk and MAKE BACKUPS of the original text.
Do you have VIM? If not, get it from the download links on this page. First, let's make sure vim is installed and, well, essentially dumped into the Windows directory. You don't have to put it there; you can put it anywhere you want. Just make sure it is in your path. So, we start with vim grimm10.txt ... Just for your information, what you have is the ORIGINAL Project Gutenberg etext. Our result SHOULD NOT BE DISTRIBUTED as a Project Gutenberg text, nor sold or any such. Please obey the copyright.
And now we have disclaimer info, blah blah blah... It's time to kill it. Type the following and press [Enter]. Oh, and type brackets only when they are not marking a keypress... (*** NEW INFO! On a UNIX variant you use Ctrl-v to enter literals. On Windows, Ctrl-V is PASTE, so this page now uses Ctrl-q instead. If you saw this page before, it's different now. ***) :
This kills all the header stuff, including the Table of Contents. There are probably a lot of blank lines here that we're going to get rid of later. Let's get rid of the annoying ^M's everywhere by replacing them with spaces:
The [Ctrl-q][Enter] are key press combinations. Don't type the brackets. There is no purpose for carriage returns/line feeds for DAISY (yet) so let's not add to our problem.
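If you'd rather check this step outside of VIM, here is a minimal Python sketch of the same idea. It is not the VIM command itself; the function name is made up for illustration, and it assumes the ^M you see in VIM is a carriage return character.

```python
def carriage_returns_to_spaces(text: str) -> str:
    """Replace every carriage return (shown as ^M in VIM) with a space."""
    # Assumption: ^M is the "\r" character; line feeds ("\n") are left alone.
    return text.replace("\r", " ")


if __name__ == "__main__":
    sample = "Once upon a time\r\nthere was a king.\r\n"
    print(repr(carriage_returns_to_spaces(sample)))
```

Like the VIM version, this leaves a space where each ^M was, so expect some doubled spaces afterwards.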
Now, we get rid of all multiple spaces:
:%s/  / /g
:%s/  / /g
:%s/  / /g
(Repeat until we get PATTERN NOT FOUND.) Then we have no more multiple spaces. There might be a lot of extraneous information (titles, Index) that we might wish to get rid of, because we only want the text. We can handle that later.
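The "keep going until nothing is found" loop can be sketched in Python too. This is only an illustration of the repeat-until-no-match idea, with a function name I made up:

```python
def collapse_spaces(text: str) -> str:
    """Squeeze runs of spaces down to a single space."""
    # Same idea as rerunning the substitution until the pattern is not found:
    # each pass turns two spaces into one, and we stop when no pair remains.
    while "  " in text:
        text = text.replace("  ", " ")
    return text


if __name__ == "__main__":
    print(collapse_spaces("a     king  had   a garden"))
```

One pass is not enough for long runs of spaces, which is why the substitution has to be repeated.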
We should remove all titles now (this is a "best guess" that all chapters/titles are all caps and have at least 4 caps in a row):
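As a rough Python sketch of that "best guess" (not the VIM command itself), the following drops any line containing four or more capital letters in a row. Removing the whole line, rather than just the matched letters, is my assumption, and the names are made up for illustration.

```python
import re

# At least four capital letters in a row, per the "best guess" heuristic.
FOUR_CAPS = re.compile(r"[A-Z]{4}")


def drop_title_lines(text: str) -> str:
    """Drop every line that contains a run of four or more capitals."""
    # Assumption: a title occupies its own line, so the whole line goes.
    kept = [line for line in text.split("\n") if not FOUR_CAPS.search(line)]
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "THE GOLDEN BIRD\nA certain king had a garden.\nHANS IN LUCK\nSome men are born lucky."
    print(drop_title_lines(sample))
```

A title made only of short words (three capitals or fewer each) slips through this filter, so it really is just a best guess.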
Removing all blank lines between paragraphs:
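For comparison, a minimal Python sketch of blank-line removal. Treating whitespace-only lines as blank is my assumption, and the function name is made up.

```python
def drop_blank_lines(text: str) -> str:
    """Remove empty lines and lines that hold only whitespace."""
    # Assumption: a line with nothing but spaces or tabs counts as blank.
    return "\n".join(line for line in text.split("\n") if line.strip())


if __name__ == "__main__":
    print(drop_blank_lines("First paragraph.\n\n   \nSecond paragraph.\n"))
```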
In this case, we might wish to remove the information on the history of the Grimm Brothers:
It might behoove us to remove apostrophes, quotes, parentheses, if they exist:
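A small Python sketch of the same cleanup, in case you want to see it spelled out. The exact set of characters is my assumption (straight apostrophe, straight double quote, and both parentheses); add others if your text uses them.

```python
# Characters to delete: apostrophe, double quote, and both parentheses.
# Assumption: the etext is plain ASCII, so there are no curly quotes to handle.
STRIP_TABLE = str.maketrans("", "", "'\"()")


def strip_punctuation(text: str) -> str:
    """Delete apostrophes, quotes, and parentheses from the text."""
    return text.translate(STRIP_TABLE)


if __name__ == "__main__":
    print(strip_punctuation("'Hans' said (quietly), \"don't go\""))
```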
There is a special case in this etext, a footnote:
Let's now make a big one-line-file:
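Joining everything into one line looks like this as a Python sketch. Putting a single space between the joined lines is my assumption, and the function name is made up.

```python
def join_lines(text: str) -> str:
    """Join all lines into one long line, separated by single spaces."""
    # Assumption: one space goes between lines so words do not run together.
    return " ".join(text.splitlines())


if __name__ == "__main__":
    print(join_lines("a certain king\nhad a beautiful garden\n"))
```

Any line that already ended with a space, or any leftover blank line, turns into a doubled space here.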
Then check for double spaces:
:%s/  / /g
Now we terminate sentences: the "0" here is a zero.
And convert spaces to carriage returns:
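The one-word-per-line step, sketched in Python. I use "\n" as the line break here; which line ending Daisy actually expects is not something I'm asserting, and the function name is made up.

```python
def spaces_to_newlines(text: str) -> str:
    """Put every space-separated item on its own line."""
    # Assumption: the text holds single spaces only (doubles already removed),
    # otherwise empty lines would appear in the output.
    return text.replace(" ", "\n")


if __name__ == "__main__":
    print(spaces_to_newlines("a certain king had a garden"))
```

If you write the result to a file in text mode on Windows, Python converts "\n" to the usual carriage return/line feed pair for you.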
We can add Daisy's name to the top and add 0 for don't learn (or 1 for learn):
Use your favorite program, or Windows Explorer, to copy the .dsy file to your daisy directory and change Daisy to load that file.
We're done. You can go somewhere else now, or ask me: email@example.com why it's not working for you.
Oh, and the entire script? Here: Don't forget that [Ctrl-q] ([Ctrl-v] on a UNIX variant), [Enter], [Shift-o], and [Esc] are keypresses WITHOUT brackets. Otherwise, use brackets as you see them.