

half a gigabyte

From: uli@zoodle.robin.de (Ulrich Grepel)
Date: Thu, 10 Jun 93 01:33 MET DST
Subject: half a gigabyte
To: love-hounds@uunet.UU.NET

Ok, here's the promised summary of our ideas so far:

Contents:

- Complete love-hounds archive.

  Organized as mbox files. Superfluous headers removed. Bogus headers from news
  systems sending posted articles to love-hounds@uunet.uu.net removed.

  Messages are grouped into files of 100 messages each, or into a separate file
  for each month. Months are difficult, since they overlap quite a lot. At the
  moment there are 207 such 100-message files in the archives, and another 20-21
  are still missing from them. That makes a total of about 22700 messages.

  An index is added to the files that contains sorted lists of "Subject:" and
  "From:" lines. "Date:" is not needed, since that's the natural order.
  Subindices aren't needed, since threads are short and not too many of them
  are going on at any one time.
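
  A minimal sketch of how such an index could be built, assuming the archive
  files are ordinary mbox files readable by Python's standard mailbox module.
  The function name build_index and the idea of numbering messages by their
  position in the file are my assumptions, not something we have agreed on.

```python
import mailbox


def build_index(mbox_path):
    """Return (subjects, senders) for one mbox file.

    Each list holds (header value, message number) pairs, sorted
    case-insensitively by header value. Message numbers follow the order
    of the file, which is already the date order.
    """
    subjects, senders = [], []
    box = mailbox.mbox(mbox_path)
    try:
        for number, message in enumerate(box, start=1):
            # Missing headers are indexed as empty strings.
            subjects.append((str(message.get("Subject", "")), number))
            senders.append((str(message.get("From", "")), number))
    finally:
        box.close()

    def sort_key(entry):
        return (entry[0].lower(), entry[1])

    return sorted(subjects, key=sort_key), sorted(senders, key=sort_key)
```

  A reader program could load these two lists once and jump straight to the
  message number, instead of scanning every file for a header.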

  Full text search would be nice to have. Some systems (e.g. NeXT) already have
  some sort of full text search (e.g. Digital Librarian/Indexing Kit) that might
  be connected to the mail reader software used for reading the archive.
  Specially written or adapted readers would be able to make use of indices.

  Any better idea for searching the archives is WELCOME. But think of the
  time needed to do such a thing manually. 10 seconds per article results in
  about 63 hours of work, and 10 seconds is WAY TOO OPTIMISTIC. The only way of
  doing this is sharing the work. One month for each volunteer?
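
  As a rough check of that estimate, here is the arithmetic as a small sketch,
  using the total of about 22700 messages given above. The 30-second value is
  only an assumed, less optimistic figure for comparison.

```python
def manual_hours(messages, seconds_per_message):
    """Hours of manual work for a message count and per-message effort."""
    return messages * seconds_per_message / 3600


# 22700 is the approximate archive size; 30 s is an assumed alternative.
for seconds in (10, 30):
    hours = manual_hours(22700, seconds)
    print(f"{seconds} s per message: about {hours:.0f} hours")
```

  At 10 seconds per message this comes to about 63 hours, at 30 seconds to
  about 189 hours, which is why the work has to be shared.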

- pictures.

  Specially scanned pictures plus all pictures available at the moment, sorted
  into categories like

  - A scan of every album front and back. Includes all singles and maybe boots?
  - A scan of EXACTLY WHERE each of the hidden KTs is!!
  - A "Family Album" section with Kate and friends through the years
  - A "Scrapbook" section with lots of different KT shots.
  - A KateCon section!
  - A NetFaces section (I suggest 64*64 2 to 24 bit TIFFs (as used by NeXTmail))
  - remaining pics

  The file format of these pictures should probably be GIF. If we really don't
  know how to fill up 600 MB we can add other formats, preferably TIFF and JPEG.

  The pictures should conform to a naming scheme so that we don't have too many
  problems describing or adding future pictures.

  If there's any room left I suggest thinking about MPEGs (I'm a Rocket Man...).

- special text files.

  - Cloudbusting
  - Deeper Understanding
  - The Garden
  - Lyrics (with and without annotations. I once tried to translate the lyrics
            into German. I probably was quite unsuccessful. Anyone interested?)
  - Kate's poems
  - Discography (basic, extended, fully extended)
  - Extended FAQ
  - Intro to the Net and to rec.music.gaffa / Love-Hounds
  - All the reviews and newsbits from magazines that can be found in the archives
  - All the rest of the archives.

  The last two should go in ASCII only, but all the others should be available
  in more than one format to be printable in a good-looking way. These formats
  don't have to be searchable; there's always the ASCII version to search in.
  Formats mentioned include:

  - ASCII (always necessary as a base, universally usable)
  - rtf (NeXT, MS-Windows, MS-Word (also on Macintosh), includes NeXT and
         MS-Windows help files)
  - LaTeX
  - info
  - WordPerfect
  - MS-Word
  - PostScript
  - AmigaGuide
  - World Wide Web distributed hypertext system (distributed? on a CD?)
  - Whatever we want.

  Some of these formats are editable, searchable, indexable, hypertextable,
  nice-looking, and more; some aren't. Most should be able to be generated from
  the others, maybe with some step in between.

  As we are talking about 10 to 15 megabytes of text here it should not be
  a problem to include as many formats of text as we want. If it really gets
  too much we might start a) compressing stuff and/or b) using diff to remove
  much of the text. Unfortunately this will inhibit working directly on the
  CD with the formatted texts.

- sound files

  Ideas include important snippets from the work of this woman ( ;-) ). Sound
  format should be decided after looking at the amount of sounds that should
  find a place. I suggest something between mono 8bit 8kHz mulaw and stereo
  16 bit 44.1 kHz.
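
  To see how far apart those two extremes are, here is a small sketch of the
  data rates for uncompressed audio. The 100 MB budget is purely an assumed
  figure for illustration; nobody has proposed a size for the sound section.

```python
def bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Data rate of uncompressed audio in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels // 8


def minutes_in(megabytes, rate):
    """Minutes of audio fitting in `megabytes` (1 MB = 1,000,000 bytes)."""
    return megabytes * 1_000_000 / rate / 60


low = bytes_per_second(8000, 8, 1)     # mono 8 bit 8 kHz mulaw
high = bytes_per_second(44100, 16, 2)  # stereo 16 bit 44.1 kHz

# 100 MB is an assumed budget, used only to compare the two formats.
print(f"low:  {low} B/s, {minutes_in(100, low):.1f} min in 100 MB")
print(f"high: {high} B/s, {minutes_in(100, high):.1f} min in 100 MB")
```

  The low end needs 8000 bytes per second and the high end 176400, a factor
  of about 22, so the choice of format decides how many snippets will fit.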

  Data format will need converters, since it's not good to store sound in
  several different ways. Converters (incl. sampling frequency adaptions) are
  easy to write and/or in the PD.

  It's possible to add a normal audio track to a data CD-ROM. Most audio CD
  players are able to play them as track 2-n. Track 1 is data. Maybe we could
  persuade Kate to let us use "Suspended In GAFFA". (How could we achieve this?)

- software

  Any software that helps reading/watching/listening_to/digesting/working_with
  the rest of the stuff. Preferably in source and binary form(s). Including
  mail readers, picture viewers/converters, sound converters.

- fanzines

  Peter D.F.M.: Would you like to offer the Homeground Archives? That would be
  a very nice addition to the CD, since most of the early mags are difficult
  to find. True colleKTors would still want to have the originals, so that
  the market for old issues won't die down.

  This plea holds for all other fanzines too.

- other archives

  Since these are spin-offs from gaffa I suggest including the archives of
  the two mailing lists Ecto and Really-Deep-Thoughts. At least if there's
  room. We might even add the pictures and any other files from them. Of course
  this depends on the available space.

- CD data format

  ISO-9660 with 8+3 character upper case file names (just to please EVERYONE).

  To please the rest it would be nice to have some way to have longer and
  mixed case file names. One idea would be to create a tar file that contains
  a huge number of soft links into the CD and that of course can have
  longer file names.
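
  A minimal sketch of the soft-link idea, using Python's standard tarfile
  module. The function name build_link_tar, the mount point /cdrom and the
  file names in the usage example are all assumptions for illustration; the
  real mapping would come from whatever naming scheme we settle on.

```python
import tarfile


def build_link_tar(name_map, tar_path, cd_mount="/cdrom"):
    """Write a tar file that holds only symbolic links.

    name_map maps a long, mixed-case name to the 8+3 upper-case path of
    the real file on the CD. cd_mount is an assumed mount point; the
    links are absolute, so they only resolve if the CD is mounted there.
    """
    with tarfile.open(tar_path, "w") as tar:
        for long_name, short_name in sorted(name_map.items()):
            info = tarfile.TarInfo(name=long_name)
            info.type = tarfile.SYMTYPE  # symbolic link, no file data
            info.linkname = f"{cd_mount}/{short_name}"
            tar.addfile(info)
```

  Usage would look like this, with hypothetical names:

```python
build_link_tar(
    {"lyrics/Suspended_In_Gaffa.txt": "TEXT/LYRICS/SUSPGAFF.TXT"},
    "links.tar",
)
```

  Unpacking links.tar on a system with long file names then gives a tree of
  readable names pointing at the 8+3 files, without copying any data.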

- design

  KT-logo on the disc, some suggestion for the cover?

- cost

  anything between 50 $ (for single copies) and 5 $ (for higher quantities)

- needed

  - person with big hard disk that has enough free space to hold the stuff
  - person with access to CD mastering facilities. Preferably as near to the
    first person as possible...
  - person(s) with good scanners
  - person(s) with good OCR software (the fanzines...)
  - a query about the number of people wanting such a CD-ROM
  - volunteers
  - further suggestions

Bye,

Uli