Best file format?

Posted at 15:30 on August 24th, 2006
Admin
Reborn Gumby
Posts: 8267
As you can see, the comics are all just a bunch of image files and an HTML page to display them. Would you prefer them as PDF? Probably a more 'elegant' format, but also less flexible.
-----
Now you see the violence inherent in the system!
Posted at 16:29 on August 24th, 2006
Member
Student Gumby
Posts: 34
Uh! I almost forgot about the question of file format!
:)
Like you said, PDF is a more 'elegant' format, but also less flexible, and the files are much bigger, I presume, which probably only matters to those of us on dial-up.

Nevertheless, I think that "Masters of The Universe" would look much cooler in PDF.
(In the sense that the PDF format gives an impression more like a book or a comic.) :)

Edited by niko32 at 00:31 on August, 24th 2006
-----
I’ve seen things you people wouldn’t believe. All those moments will be lost in time like tears in rain. Time to die.
Posted at 02:04 on August 25th, 2006
Admin
Reborn Gumby
Posts: 8267
As for the file formats, PDF isn't really larger. I've converted a few of the comics, and it's almost exactly the same size as the collection of images.

The problem is rather that some scans (especially those of the minicomics) are of rather bad quality (pages rotated the wrong way, speckled images). If I did anything with them, it would take quite a lot of work to create at least bearable PDFs out of them, and in the end a few would certainly have to be scanned again.

It's almost no work at all for the better scans, though, so I might replace those in the near future.
-----
Now you see the violence inherent in the system!
Posted at 06:15 on August 25th, 2006
Member
Retired Gumby
Posts: 713
PDF is not good. It's more "elegant", but usually has lower quality (note: this is based on other people's reports; I haven't noticed a significant difference in quality myself), is difficult to extract the images from unless you have the full Acrobat program (not just the reader), and doesn't let you manipulate the images in any way (like straightening out crooked ones, which you can currently do by opening them in any image viewer, correcting them, then sticking them back in the archive).
Instead, I would suggest linking to some sequential image viewers from the main comics page. Programs like CDisplay can take an archive of images and display them much as if they were a PDF file. The main difference is that the images are still in an archive, which can easily be opened by other programs. CDisplay also has "custom" formats associated with it, but these are actually just normal archive formats renamed (so that you can associate image archives with the prog without associating all archives with it).
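To illustrate the "renamed archive" point, here is a minimal Python sketch, assuming the common .cbz convention (a plain ZIP file with a different extension). The function names and the list of image suffixes are made up for this example and do not come from any of the viewers mentioned.

```python
import zipfile
from pathlib import Path

# Assumed set of page formats; adjust to whatever the scans actually use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}

def make_cbz(image_paths, cbz_path):
    """Pack image files into a ZIP archive that merely carries a .cbz name."""
    # ZIP_STORED: JPEG/PNG data is already compressed, so deflating gains little.
    with zipfile.ZipFile(cbz_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for path in sorted(Path(p) for p in image_paths):
            archive.write(path, arcname=path.name)

def list_pages(cbz_path):
    """Return the image entries of a .cbz in name-sorted (reading) order."""
    with zipfile.ZipFile(cbz_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if Path(name).suffix.lower() in IMAGE_SUFFIXES
        )
```

Because the result is still an ordinary ZIP file, any archive tool can open it and the pages stay editable as individual images.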
Some progs that can read archives:
WINDOWS
-------------
Used Myself
+++++++++++
BDZ explorer http://english.fbsoft.org/
Official CDisplay http://www.geocities.com/davidayton/CDisplay
Unofficial CDisplay http://www.cdisplay.net/
Open source CDisplay http://sourceforge.net/projects/cdisplayex
Picwalker http://www.oma-penny.com/software.php
+++++++++++
Comic Rack http://comicrack.blogspot.com/
=================
LINUX
-----------
Scansreader http://colas.nahaboo.net/software/scansreader/
Comix (python/gtk) http://comix.sourceforge.net/
CBview (perl/gtk) http://elvine.org/code/cbview/
============
MAC
--------
FFview http://www.feedface.com/projects/ffview.html
ComicBookLover http://www.comicbooklover.com/
===========
CROSS-PLATFORM
-------------------------
Comical http://comical.sourceforge.net/
JAVA
---------
Jomic http://jomic.sourceforge.net/
Comic Viewer http://home.asparagine.net/software/comicviewer/
*The python and perl linux ones are also arguably cross-platform.

More info and progs can be found here:
http://sketchyorigins.com/comics/forumdisplay.php?f=61

Another alternative would be the DJVU format. It's similar to PDF, but gets better quality/compression. The main drawback is that people will have to download a new prog/plugin to view them (whereas most people already have a PDF reader, and the archive viewers are optional).
-----
At the end of the day, you're left with a bent fork & a pissed off rhino.
Posted at 07:33 on August 25th, 2006
Admin
Reborn Gumby
Posts: 8267
Thanks, I'll look into those alternatives.

Re. 'disassembling' PDF files: Imagemagick (free) does that with a single command (just like creating PDF from multiple images), so no problems there.
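
As a rough sketch of the two single commands meant here, the Python below only builds the argument lists for ImageMagick's `convert` tool (newer releases call the binary `magick`). The file names, the output pattern and the 150 dpi density are assumptions for illustration. As far as I know, ImageMagick reads PDFs through Ghostscript and renders each page at the given density rather than copying the embedded scans out unchanged, so the density setting matters for quality.

```python
import shutil
import subprocess

def images_to_pdf_cmd(image_files, pdf_file):
    """Command that bundles images into one PDF: convert a.jpg b.jpg out.pdf"""
    return ["convert", *image_files, pdf_file]

def pdf_to_images_cmd(pdf_file, output_pattern="page-%03d.png", density=150):
    """Command that renders each PDF page to a numbered image file."""
    # -density sets the rendering resolution (dpi) and must precede the input.
    return ["convert", "-density", str(density), pdf_file, output_pattern]

def run_if_available(cmd):
    """Run the command only if the ImageMagick binary is on the PATH."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True
```

For example, `run_if_available(images_to_pdf_cmd(["p01.jpg", "p02.jpg"], "comic.pdf"))` would create the PDF on a machine where ImageMagick is installed and do nothing otherwise.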
-----
Now you see the violence inherent in the system!
Posted at 08:07 on August 25th, 2006
Member
Retired Gumby
Posts: 713
The images can be taken out of the pdf, but you need some form of special software (most people who don't have a special need for it don't already have imagemagick, so they'd have to download it just for this), and the resulting images are usually of much lower quality than they should be. Then there's the issue of putting images back into the pdf after altering them. Unless you have ghostscript, few progs other than the full version of acrobat can create pdfs. Here are a few discussions on the subject:
http://www.sketchyorigins.com/comics/showthread.php?t=6907
http://www.sketchyorigins.com/comics/showthread.php?t=3223
http://sketchyorigins.com/comics/showthread.php?t=8547
http://www.sketchyorigins.com/comics/showthread.php?t=5566

re DJVU: The comic which is a 4.5meg rar file can be made into a 2.9meg djvu file while still retaining good quality.
One advantage to djvu is that files still remain readable, even at very low quality settings. At high compression, jpeg creates a lot of artifacts around sharp edges (like text), which make it very difficult to read. Djvu on the other hand, seeks out and emphasizes edges.
http://djvuzone.org and http://planetdjvu.com are the best places to start for djvu info. The files can be viewed with xnview or irfanview (and I think imagemagick as well), although not very well (they display djvu images at pixel size instead of print size, so you need to zoom out a bit to make them look right). The best way to view them is with windjvu (there's a linux and mac version too), djvu solo 3.1 (also the best way to create them, or at least the best freeware way), or with a browser plugin. Unlike archives, djvu files can be read online. Unlike pdfs, djvu files don't have to be fully downloaded before they'll appear in the browser. The first page will show as soon as it's done, then the second page will load in the background.
I took a page from your archive, straightened it, then made djvu images of it at 2 different quality settings:
http://drahken.t35.com/star_stars03b.djvu 198k
http://drahken.t35.com/star_stars03b2.djvu 106k
http://drahken.t35.com/star_stars03b2.jpg 326k

Edited by Cypherswipe at 16:21 on August, 25th 2006
-----
At the end of the day, you're left with a bent fork & a pissed off rhino.
Posted at 02:52 on August 26th, 2006
Admin
Reborn Gumby
Posts: 8267
Quote:
http://www.sketchyorigins.com/comics/showthread.php?t=6907
http://www.sketchyorigins.com/comics/showthread.php?t=3223
http://sketchyorigins.com/comics/showthread.php?t=8547
http://www.sketchyorigins.com/comics/showthread.php?t=5566
Now that I've read these four, I still don't see much of an argument against PDF in general. The first thread actually lists quite a lot of tools for PDF handling, and as I said, Imagemagick does it, too - and without quality loss if you set the right options.

I fully believe that these other, more specialized file types might have a better compression ratio / quality, but the question is always availability. The question of tools can be split in two. First of all (and most important), people have to be able to read the comics. I think we all agree that PDF is the winner there. Most people would have to download a special tool for one of the other formats, and the viewers are generally available for far fewer platforms. Second, there are tools for processing, which we already touched on. As you said correctly, many people might not have tools like Imagemagick installed (which I frankly can't understand, because it's the single most useful image processing program I've ever seen), so they have to download one. That's the same for all the mentioned formats, though, and there's also the question of how many people would want to change anything in the files (my guess: few).

I found one argument pretty convincing, though. If I were to recompile the images I have now into a PDF, I would have to recompress already pre-compressed files, i.e. running them through a lossy filter a second time. This is probably not the greatest way...
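
A toy Python model of that concern, not of JPEG itself: each lossy pass is reduced to rounding sample values to a step size, and the step sizes 8 and 10 are arbitrary assumptions. It only shows that rounding already-rounded data with a different step can move values further from the original than a single pass would.

```python
import math

def quantize(value, step):
    """Round a sample to the nearest multiple of step (the lossy part)."""
    return math.floor(value / step + 0.5) * step

def total_error(samples, steps):
    """Sum of absolute errors after applying each quantisation pass in turn."""
    error = 0
    for original in samples:
        value = original
        for step in steps:
            value = quantize(value, step)
        error += abs(value - original)
    return error

samples = range(256)
once = total_error(samples, [10])      # compressed a single time
twice = total_error(samples, [8, 10])  # compressed, then compressed again
```

In this model a sample of 12 ends up at 10 after one pass with step 10, but at 20 after a pass with step 8 followed by one with step 10.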
-----
Now you see the violence inherent in the system!
Posted at 08:56 on August 26th, 2006
Member
Retired Gumby
Posts: 713
An argument in favor of any of the viewers over imagemagick is that they're much smaller. Another flaw with image magick is that it relies very heavily on commandlines, and only linux people like commandlines (indeed, most non-linux people run in fear from commandline progs). I know that imagemagick has some GUI abilities, but they're limited at best. There are some 3rd party frontend interfaces for imagemagick, but then people would have to install 2 progs just to get the functionality of a single prog.
-----
At the end of the day, you're left with a bent fork & a pissed off rhino.
Posted at 02:12 on August 27th, 2006
Admin
Reborn Gumby
Posts: 8267
Yes, you're right, I can't force anyone to use the easiest solution if they prefer complicated and mediocre ones.

Anyway, back to the topic: You're still mixing up viewers ('must-have') and manipulation tools ('nice to have'). I.e. you're comparing apples with oranges.
-----
Now you see the violence inherent in the system!
Posted at 08:57 on August 30th, 2006
Member
Retired Gumby
Posts: 713
Here's a comparison of PDF vs DJVU: http://www.planetdjvu.com/djvu_digital_vs___super_hero__pdf.htm Doesn't say whether the guy used the open source prog or the lizardtech prog though.
-----
At the end of the day, you're left with a bent fork & a pissed off rhino.
Posted at 11:09 on August 30th, 2006
Member
Baby Gumby
Posts: 1
Quote:
Posted by Cypherswipe at 16:07 on August, 25th 2006:

The images can be taken out of the pdf, but you need some form of special software [...], and the resulting images are usually of much lower quality than they should be. Then there's the issue of putting images back into the pdf after altering them. Unless you have ghostscript, few progs other than the full version of acrobat, can create pdfs.


I also prefer CBZ over PDF, nevertheless here are a few remarks in defense of PDF:

While it is true that you need special software to work with images in PDF files, you can get everything you need for free, open source, and platform independent. PDFBox is a general library to work with PDF documents, which also includes a simple command line tool to extract images. I've integrated it in the Jomic comic viewer (and converter) you mentioned above. See the attached screenshot on how to convert CBZ to PDF or the other way around using Jomic. There is also a menu item to extract all images of a (PDF) comic to a directory.

Technically speaking, PDF documents are files with segments for text, images and other stuff (for example forms). They support a limited set of image formats, of which TIFF and JPEG/JFIF are of practical relevance.

Because of that, you can include/extract JPEG data without applying its lossy compression algorithm again and again. Software can simply do a 1:1 copy of the JPEG stream from one file to the other without actually decompressing it.
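
A minimal Python sketch of such a 1:1 copy, using only the standard library. It is not how PDFBox or Jomic do it; the function names are invented here, and it assumes a baseline-style JPEG with 8 bits per component and a page size of one point per pixel. The JPEG bytes are written into the PDF as an image stream marked with the DCTDecode filter, without being decoded.

```python
import struct

def jpeg_info(data):
    """Return (width, height, components) from a JPEG's start-of-frame segment."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 10 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("unexpected byte where a marker should be")
        marker = data[pos + 1]
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        # SOF0-SOF15 hold the frame size; C4, C8 and CC are not frame headers.
        if 0xC0 <= marker <= 0xCF and marker not in (0xC4, 0xC8, 0xCC):
            height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
            return width, height, data[pos + 9]
        pos += 2 + length
    raise ValueError("no start-of-frame segment found")

def jpeg_to_pdf(jpeg):
    """Wrap one JPEG in a single-page PDF, copying its bytes verbatim."""
    width, height, components = jpeg_info(jpeg)
    colorspace = {1: "DeviceGray", 3: "DeviceRGB", 4: "DeviceCMYK"}[components]
    image_dict = (
        f"<< /Type /XObject /Subtype /Image /Width {width} /Height {height} "
        f"/ColorSpace /{colorspace} /BitsPerComponent 8 "
        f"/Filter /DCTDecode /Length {len(jpeg)} >>"
    ).encode()
    # Page content: scale the unit square to the image size and paint it.
    content = f"q {width} 0 0 {height} 0 0 cm /Im0 Do Q".encode()
    page = (
        f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 {width} {height}] "
        "/Resources << /XObject << /Im0 4 0 R >> >> /Contents 5 0 R >>"
    ).encode()
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        page,
        image_dict + b"\nstream\n" + jpeg + b"\nendstream",
        f"<< /Length {len(content)} >>".encode()
        + b"\nstream\n" + content + b"\nendstream",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object for the xref table
        out += f"{number} 0 obj\n".encode() + body + b"\nendobj\n"
    xref_pos = len(out)
    out += f"xref\n0 {len(objects) + 1}\n".encode()
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += f"{offset:010d} 00000 n \n".encode()
    out += (
        f"trailer\n<< /Size {len(objects) + 1} /Root 1 0 R >>\n"
        f"startxref\n{xref_pos}\n%%EOF\n"
    ).encode()
    return bytes(out)
```

Since the JPEG data sits in the PDF byte for byte, extracting it again is a matter of copying the stream back out, which is why no second lossy pass is needed.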

The size of PDFs should be comparable to ZIP provided that TIFF images use compression. This compression is lossless (similar to ZIP), so it is okay to include/extract them using for example PNG images on the other side (which are also lossless).

Of course, developers can screw up and decode/re-encode JPEG data or disable compression for TIFF images. So in theory, you can end up with degenerating image quality or with bloated files. For all I know PDFBox handles this stuff properly, and Jomic doesn't mess with it either.

Still, I also noticed that some PDFs contain extraordinarily sucky images, especially product presentations you can download from vendor sites. This is not the fault of the PDF format but rather of the document author (or the default settings of the PDF generator software used).

Hope that sheds some light on this.

Edited by roskakori at 19:11 on August, 30th 2006
Attachment: *****
Posted at 02:02 on August 31st, 2006
Admin
Reborn Gumby
Posts: 8267
Quote:
Still, I also noticed that some PDFs contain extraordinarily sucky images, especially product presentations you can download from vendor sites. This is not the fault of the PDF format but rather of the document author (or the default settings of the PDF generator software used).
Exactly. In fact, the author of the article on Planet DjVu Cypherswipe linked to comes to the same conclusion in his bottom table: "Gold-in -> Gold-out".

After some consideration, I guess it wouldn't be worth the effort converting the old scans to any new format. At the end of the day, a collection of images should be fine, and as I said before, many would have to be heavily retouched anyway if I were to touch them, amounting to more work than just scanning everything again. And that's something I don't really want to do.

Thanks for all the input on file formats, though, I can use this knowledge for myself in any case :D
-----
Now you see the violence inherent in the system!
Posted at 05:09 on August 31st, 2006
Member
Retired Gumby
Posts: 713
I still think it'd be a good idea to list some of those readers on the comics page though, to give people a more convenient option than extracting the archive and using a web browser.
-----
At the end of the day, you're left with a bent fork & a pissed off rhino.
Posted at 05:21 on August 31st, 2006
Admin
Reborn Gumby
Posts: 8267
Sure, I'll do that :)
-----
Now you see the violence inherent in the system!
Posted at 10:07 on August 31st, 2006
Member
Retired Gumby
Posts: 713
Been tinkering around with DJVU some more (I have a certain fascination with the format), and I found that it works extremely well with the ladybird books (and probably the early minicomics as well, although the scans of those are probably too small to turn out well). Not only does it result in a file only 2/3 the size of the original, but it actually looks marginally better than the original. Compare this file to the original ladybird:asteroids archive: http://rapidshare.de/files/31445152/ladybird_asteroids.djvu.html (best viewed in a dedicated DJVU viewer).

For those who want a quick sample of the visual difference:
http://allspark.net/cypherswipe/ladybird_asteroids-DJVUsample.jpg
vs the original: http://allspark.net/cypherswipe/ladybird_asteroids04.jpg

Having large chunks of text and separate images (especially painted-type images like in the ladybird books, as opposed to more graphic-type images like in comics) suits the DJVU format extremely well.

Edited by Cypherswipe at 18:24 on August, 31st 2006
-----
At the end of the day, you're left with a bent fork & a pissed off rhino.
Posted at 06:18 on September 1st, 2006
Admin
Reborn Gumby
Posts: 8267
There is indeed a visual difference: the original image files are a lot sharper. While this slight blurring suits the images very well indeed, the text gets less readable at the same time.
-----
Now you see the violence inherent in the system!
Posted at 14:55 on September 6th, 2006
Admin
Reborn Gumby
Posts: 8267
I experimented with all the formats you listed above quite a bit now. The options I like are:

-compressed archive accessed with some frontend (Comix seems to be nice)
-PDF

Sorry, but I can't make head or tail of DJVU. Apart from sporting a very annoying name, I strongly dislike that basically only the readers are freely available while the 'encoders' are a huge secret. Keeping algorithms a secret is never a good idea.

PDF, in spite of larger file size, doesn't have these disadvantages. In fact, PDF/A is even a certified ISO standard. Image quality is excellent if the source files are good enough.

While I personally tend to agree archived images are the way to go, I suspect PDF would prove to be a lot more popular with most people, simply because of what people are used to and the availability of the viewer software. Let's face it: only very few people are interested in retouching images and the average attention span is so low that an 'unknown' format can turn people away very quickly.

A big problem with PDF is that many people these days tend not to download and store the files, but to view them in their browsers every time they want to read a document. For this reason, I never offer PDF files directly on the site, but run them through RAR (in spite of that not actually compressing the file).

The question is now: What will people do if I just offer two versions of the same scan (PDF and bunch of images)? Experts and halfway intelligent people will choose the one they want, but what about all the idiots running loose on the net? Will they blindly download both, costing me twice as much?

I also experimented with many filters to improve image quality in the scans. Coming along nicely, I got rid of all those speckles without interfering with text much. What I would like to do is also get rid of the back of pages shining through. The closest I got is a white-value filter which just turns every pixel brighter than value X to white. This works very well on individual pages, but it's not suited for bulk application. I'm not willing to retouch pages individually. Does anyone have an idea or is it too unclear what I'm talking about?
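
For anyone unsure what the white-value filter means, here is a minimal Python sketch. It assumes the page is already available as rows of 8-bit grayscale values; loading and saving the scans is left out, and the function name is invented for this example.

```python
def whiten_above(pixels, threshold, white=255):
    """Set every grayscale pixel brighter than threshold to pure white."""
    return [
        [white if value > threshold else value for value in row]
        for row in pixels
    ]

# Tiny example: faint show-through (201, 240) is wiped, darker ink is kept.
page = [[10, 200, 240], [255, 199, 201]]
cleaned = whiten_above(page, threshold=200)
```

The difficulty for bulk use is visible here: the single `threshold` value has to sit between the show-through and the real print, and that gap differs from page to page.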
-----
Now you see the violence inherent in the system!
Posted at 09:46 on September 7th, 2006
Member
Retired Gumby
Posts: 713
I'm still against PDF, although it doesn't really matter since I already have all the comics.
The issue of people reading PDFs instead of downloading them is a problem, however I think putting the PDFs inside RAR files defeats the purpose of using PDFs in the first place. If people have to unzip it before they can read it, they might as well just grab the images.

I like the cdisplay archive reader, with pic walker in second place, but they're all roughly equal. It's mainly a question of which features you need and which you can do without.

I agree that the situation with DJVU encoders is bullshit, although the djvulibre situation is quite similar to the ghostscript one for PDFs. My main fascination with DJVU is its potential rather than its reality (unfortunately, it looks to be a long time before that potential becomes reality).

I think the best solution is to go with the archive readers, but make it clear that they are an optional way to view the archives, and that people can still view the archives without having to download any new software.

I don't think very many people will download both. The nice thing about idiots is that they tend to be lazy, downloading 2 files for one comic would be too much effort. I think most will either close their eyes and pick one, or get frustrated and leave.

I've tinkered with trying to rid images of such noise before, and using white level filters combined with gamma and black level filters (and often saturation settings, if the colors wind up washed out) is really the only method to clean them up, aside from manually painting out the noise.
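
A rough Python sketch of combining those filters, for grayscale values only (saturation is not covered). The default black point of 30 and white point of 200 are arbitrary assumptions, and the function names are invented for this example.

```python
def adjust_levels(value, black=0, white=255, gamma=1.0):
    """Map one 0-255 grayscale value through black point, white point and gamma."""
    if white <= black:
        raise ValueError("white point must be above black point")
    # Clip to the [black, white] window and rescale it to 0.0-1.0.
    scaled = (min(max(value, black), white) - black) / (white - black)
    # gamma > 1 brightens the midtones, gamma < 1 darkens them.
    return round(255 * scaled ** (1.0 / gamma))

def clean_page(pixels, black=30, white=200, gamma=1.0):
    """Apply the same levels curve to every pixel via a 256-entry lookup table."""
    table = [adjust_levels(v, black, white, gamma) for v in range(256)]
    return [[table[v] for v in row] for row in pixels]
```

Everything at or above the white point becomes pure white (which removes light noise and show-through), everything at or below the black point becomes pure black, and gamma shifts what lies in between.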

As far as people downloading both pdf and rar/cbr, what about this: Put a single download link per comic on the comic page. On the next page, put 2 form buttons. Have them say "[download pdf format] OR [download cbr format]", and have it so that clicking one will disable the other. That should make it clear to even an idiot that they only need one, and it'll make them less likely to download both if they have to go back to the comics page and go through all the steps a second time to get the other one.

Edited by Cypherswipe at 12:57 on September, 08th 2006
-----
At the end of the day, you're left with a bent fork & a pissed off rhino.
Posted at 19:53 on November 7th, 2006
Member
Retired Gumby
Posts: 713
I realise there's probably no one else interested in it, but I've been tinkering around with DJVU some more. If you use the photo setting instead of scanned, the image doesn't look any different than the original JPG, but is 20~40% smaller. (The size doesn't change much (if at all) on individual images, but really shows up when you encode a whole issue.)
-----
At the end of the day, you're left with a bent fork & a pissed off rhino.
Posted at 07:28 on November 8th, 2006
Admin
Reborn Gumby
Posts: 8267
Oh, I'm interested in it alright. I just don't really have the time to experiment myself these days. However, I'm sucking all this information up and storing it in the back of my head for later use ;)

Anyway, what do you think about the latest scans on the site? I'm fairly satisfied with the quality myself.
-----
Now you see the violence inherent in the system!
-----
Edited by Mr Creosote at 07:48 on November 8th, 2006