Hello all. Creating an electronic edition of Fortean Times has come up here a few times, with particular interest in making back issues available, either for general reading or as a searchable database of some kind.
The copyright clearly belongs to the magazine, but I know they are quite stretched just keeping the current magazine going, so I am proposing (to them and to the user base) that we pool our resources to create it for them.
I will speak to the management to see if I can get official approval for this, with the understanding that they will own the end product.
One thing we would need to sort out before scanning a single page for OCR (i.e. turning a magazine page into text and photos) is how we are going to organise the material. Scanned pages are useless unless they are turned into text, proofread and organised in a way that is readable and can be indexed.
I would like to hear your thoughts on how to do this, but my initial thoughts are:
1 - Scan in the magazines and save them as TIFF files (multi-page, high-resolution images). This lets other people proofread the resulting text against the original without having to pass around rare back issues.
2 - Leave out all the adverts, back issue subscription offers etc. and only spend time on the articles, letters and photos.
3 - Create one file per article with a name like FT101-P43-UFO, where that means issue 101, page 43 and a general subject of UFOs. More detail on the indexing is to follow.
4 - Save photos using the same naming format, and where there is more than one photo for the same article, just add a number to the end of the name.
5 - Indexing. This is the tricky bit. I know there are plenty of knowledge-based systems out there that can do wonders with a load of raw data like this. What I would like to see is a recreation of something like the FT issue index that exists for the first 60 or so issues (a book covering more may exist). If I can find the original book with the index, that would be a great start to the project.
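To make the naming idea in points 3 to 5 concrete, here is a rough Python sketch of how names like FT101-P43-UFO could be built, read back and grouped into a simple subject index. The function names (build_name, parse_name, build_index) are my own inventions for illustration, and I am assuming the subject code is letters only so that a trailing number can always be read as the photo number.

```python
import re
from collections import defaultdict
from typing import NamedTuple, Optional


class ScanName(NamedTuple):
    issue: int
    page: int
    subject: str
    photo: Optional[int]  # None for the article text; 1, 2, ... for photos


# Assumed pattern: FT<issue>-P<page>-<SUBJECT>[<photo number>]
# e.g. FT101-P43-UFO (article) or FT101-P43-UFO2 (second photo)
NAME_RE = re.compile(r"^FT(\d+)-P(\d+)-([A-Za-z]+)(\d+)?$")


def build_name(issue, page, subject, photo=None):
    """Build a file name (without extension) from its parts."""
    base = f"FT{issue}-P{page}-{subject.upper()}"
    return base if photo is None else f"{base}{photo}"


def parse_name(name):
    """Split a file name back into issue, page, subject and photo number."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a valid scan name: {name!r}")
    issue, page, subject, photo = match.groups()
    return ScanName(int(issue), int(page), subject, int(photo) if photo else None)


def _sort_key(name):
    parsed = parse_name(name)
    # Article text (no photo number) sorts before its photos.
    return (parsed.issue, parsed.page, parsed.photo or 0)


def build_index(names):
    """Group file names by subject, ordered by issue, page and photo number."""
    index = defaultdict(list)
    for name in names:
        index[parse_name(name).subject].append(name)
    return {subject: sorted(entries, key=_sort_key) for subject, entries in index.items()}


names = ["FT101-P43-UFO", "FT101-P43-UFO2", "FT57-P12-UFO", "FT101-P43-UFO1"]
print(build_index(names))
```

A single subject code per article is obviously much cruder than a proper index, but it shows that a consistent naming scheme would let a first-pass index be generated automatically rather than typed by hand.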
Once complete, I would like the project to be able to generate income for FT, for example by selling the data on CDs in blocks of 25 issues or whatever (a big saving on printing costs for them, as they only need a CD burner and CD labels from their local WH Smiths). Think of almost 200 back issues: that is 8 CDs, which could sell at around 8 quid each. A nice little earner for FT, since CDs are 20p each or so. Not to mention the benefit to the Fortean community when it comes to research...
The feedback I would like from you is how we should organise the data, especially for an index. Please think through any response for viability so we can create something Charles Fort would be proud of.
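For anyone who wants to check the sums, here is the back-of-envelope version in Python. The figures are just the rough ones from this post (200 issues rounded up from "almost 200", 25 issues per CD, 8 pounds a disc, 20p per blank), and it ignores labels, cases, postage and everyone's time, so treat it as an illustration only.

```python
import math

# Rough figures from the post, in pence to avoid rounding surprises.
ISSUES = 200            # "almost 200" back issues, rounded up
ISSUES_PER_CD = 25
PRICE_PENCE = 800       # around 8 quid per CD
BLANK_CD_PENCE = 20     # around 20p per blank CD

cds_needed = math.ceil(ISSUES / ISSUES_PER_CD)
margin_per_cd = PRICE_PENCE - BLANK_CD_PENCE
margin_per_set = cds_needed * margin_per_cd

print(f"{cds_needed} CDs per full set")
print(f"Margin per full set before other costs: {margin_per_set / 100:.2f} pounds")
```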
I suspect the can of worms is just opening now...
thanks
Iain