bugbear <bugbear@trim_papermule.co.uk_trim> wrote:
>Floyd L. Davidson wrote:
>> bugbear <bugbear@trim_papermule.co.uk_trim> wrote:
>>> Jim Townsend wrote:
>>>> Canon's early RAW images were 12 bit. The newer cameras
>>>> produce 14 bit CRW images. They are saved as 16 bit
>>>> images.
>>> Ah hah!
>> But when "saved as 16 bit images", the data is re-encoded
>> using 16 bit values. It is *never* padded with zeros.
>>
>>>> There are 12 or 14 bits of actual image data and the
>>>> remaining bits are padded with zeros.
>>>> 8 bits = 255 (Binary 11111111)
>>>> An 8 bit color image is composed of three channels X 255
>>>> 16 bits = 65535 (Binary 1111111111111111)
>>>> A 16 bit color image is composed of three channels x 65535
>>>> DCraw might be able to process 16 bit images, but GIMP
>>>> cannot. GIMP converts 16 bit images to 8 bit images
>>>> when you open them. The value of 255 per channel in
>>>> GIMP's histogram is normal for an 8 bit image. I
>>>> don't know why you think this is high. (Open an 8 bit
>>>> JPEG in GIMP and you'll see the histogram still shows
>>>> 0-255 per channel).
>>> Because I was missing the info you gave me!
>>>
>>> I was expecting 12 (or maybe 14 bit) data, UNPADDED,
>>> which would have given me 4 (or maybe 6) bit data after crunching
>>> down to 8 bit in netpbm (which divides everything by 257)
>> That does not make sense. Regardless of conversion to a
>> 16-bit format, when encoded in an 8-bit file you would have
>> 8 bits, not 4 or maybe 6.
>
>I wasn't counting leading zeroes in my bit counts.
There are no leading zeros.
If you convert 8-bit data to 16-bit data, the highest
number is all 1's in _both_ cases! In the 8-bit data
the maximum value is represented by an unsigned integer
value of 0xff (255 decimal). In binary that is 8 bits
which are all 1's. In the 16-bit data the maximum value
is represented by an unsigned integer value of 0xffff
(65535 decimal). And in binary that is 16 bits which
are all 1's.
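To make that concrete, here is a minimal Python sketch of widening 8-bit samples to 16 bits by rescaling rather than by padding. The function name widen_8_to_16 and the multiply-by-257 rule are my own illustration of one common way to do it; I am not claiming this is the exact code any particular converter uses.

```python
def widen_8_to_16(v8: int) -> int:
    """Rescale an 8-bit sample (0..255) to the 16-bit range (0..65535).

    Illustrative assumption: multiply by 257, which is the same as
    copying the byte into both halves of the 16-bit word.
    """
    if not 0 <= v8 <= 0xFF:
        raise ValueError("sample out of 8-bit range")
    return v8 * 257  # 255 * 257 == 65535


for v8 in (0x00, 0x80, 0xFF):
    v16 = widen_8_to_16(v8)
    # Print both values in binary: full scale is all 1's in both widths.
    print(f"{v8:08b} -> {v16:016b}")
```

The last line printed is eight 1's followed by sixteen 1's: full scale stays full scale, with no leading zeros.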
>*IF* you take my assumption that the true-raw data is 12 bit,
>(i.e. 16 bit with 4 leading zeroes) *AND*
That is not true though. 12-bit data is just that, data
encoded into 12 bit words. It is *NOT* 16 bit words
with 12 bits used, it is 12 bit words! When the data is
stored in a file, whether it is 8-bit, 12-bit, 14-bit or
16-bit, it is _streamed_ into the 8 bit octets that
computer files use. It is *not* zero padded.
When that data is transferred from the camera to a
computer it is done using 8-bit octets for the file
format. Each pair of 12-bit data words is sent as 3
octets, totalling 24 bits.
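A small Python sketch of that kind of packing follows. The names pack12 and unpack12 are hypothetical, and the bit order (most significant bits first) is an assumption chosen for illustration; a real raw format may order the bits differently. The point is only that two 12-bit words fit exactly into three octets with nothing left over for padding.

```python
def pack12(words):
    """Pack an even number of 12-bit words into octets (2 words -> 3 octets).

    Assumed layout, MSB first: AAAAAAAA AAAABBBB BBBBBBBB
    """
    if len(words) % 2:
        raise ValueError("need an even number of 12-bit words")
    out = bytearray()
    for a, b in zip(words[0::2], words[1::2]):
        if not (0 <= a <= 0xFFF and 0 <= b <= 0xFFF):
            raise ValueError("word out of 12-bit range")
        out += bytes((a >> 4, ((a & 0xF) << 4) | (b >> 8), b & 0xFF))
    return bytes(out)


def unpack12(data):
    """Inverse of pack12: every 3 octets yield two 12-bit words."""
    if len(data) % 3:
        raise ValueError("length must be a multiple of 3 octets")
    words = []
    for i in range(0, len(data), 3):
        x, y, z = data[i:i + 3]
        words.append((x << 4) | (y >> 4))
        words.append(((y & 0xF) << 8) | z)
    return words


packed = pack12([0xABC, 0x123])
print(packed.hex(), len(packed))  # 3 octets carry 24 bits of data
print([hex(w) for w in unpack12(packed)])
```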
>that dcraw preserves this convention *AND* you simply divide
>the 16 bit data by 257 (which *IS* what pamdepth does) you would *expect*
>the resulting file to be 8 bit with 4 leading zeroes, which
>most people would call 4 bit.
But what /dcraw/ does is convert the 12-bit sensor data
into an image format, using either 8-bit or 16-bit data
for the output. There are NO padding bits, and it is NOT
12-bit data at the output.
If you look at the source code for /pamdepth/ you will
discover that it does *not* simply divide by 257 (the
right number would be 256 anyway, not 257). (Granted
that what it does is very close to that!)
Regardless, dividing by 256 would *not* result in 8-bits
with some number of leading zeros. If you did that the
size of the file would remain exactly the same. But if
you look at the results, you'll see that /pamdepth/ will
change a 16-bit file to an 8-bit file with the
consequence that the file size is reduced by just about
50% (not quite because of the metadata overhead, which
is very small).
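As a rough illustration (a sketch only, not the actual /pamdepth/ source), here is what rescaling samples from one maxval to another looks like in Python. The function name rescale and the round-to-nearest formula are my assumptions about the general approach. What matters is that the output samples span the full 0..255 range and each one occupies a single byte, so the sample data is half the size.

```python
def rescale(sample, old_maxval=65535, new_maxval=255):
    """Rescale one sample to a new maxval, rounding to nearest.

    Assumed formula for illustration; not copied from any Netpbm source.
    """
    return (sample * new_maxval + old_maxval // 2) // old_maxval


samples16 = [0, 1, 32768, 65535]
samples8 = [rescale(s) for s in samples16]

# Raw PPM stores 16-bit samples as 2 bytes each (big-endian)
# and 8-bit samples as 1 byte each.
raw16 = b"".join(s.to_bytes(2, "big") for s in samples16)
raw8 = bytes(samples8)

print(samples8)                # full scale maps to 255, not to a 4-bit value
print(len(raw16), len(raw8))   # 8 bytes versus 4 bytes
```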
>It appears that no less than two of my assumptions are wrong;
>The data (I think) in the CRW file is right padded with zero,
>not left, and in any case it appears that dcraw normalises to
>"full" 16 bit.
It is not zero padded at all.
>Given my assumptions, I claim that both my reasoning
>and conclusions were valid.
>
>Shame about the assumptions.
Here, let's look at some real life examples. I can't use
a Canon raw file, but a Nikon one will do the same (or
you can point me at a suitable Canon raw file on the web
somewhere and I'll demonstrate that it works *exactly*
the same way).
I have a file, d2x_8696.nef, that /exiftool/ shows to be
a 12-bit uncompressed NEF formatted data file, and the
image size is 4320x2868, for 12,389,760 pixels. The
file's size on disk is 20,330,936 bytes. If it actually has
12.4 million 12-bit data points packed as I said above
(no zero padding, each pair of 12-bit words divided
into 3 8-bit octets), the image data takes 1.5 bytes per
pixel, so the file must be at least (1.5 * 12,389,760)
bytes in size, and the extra would be overhead for the
thumbnail and metadata. That works out to 18,584,640,
which leaves 1,746,296 for the overhead. Those are very
reasonable numbers, which strongly suggests that the
12-bit NEF data is not zero padded in any way.
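The same arithmetic as a few lines of Python, using only the figures quoted above (the variable names are mine). It also shows the other half of the argument: if each 12-bit value were padded out to 16 bits, the image data alone would be larger than the whole file.

```python
width, height = 4320, 2868        # image size reported by exiftool
bits_per_sample = 12
file_size = 20_330_936            # size of d2x_8696.nef on disk, in bytes

pixels = width * height
packed_bytes = pixels * bits_per_sample // 8   # 1.5 bytes per pixel
padded_bytes = pixels * 2                      # 16 bits per pixel if zero padded

print(pixels)                      # 12389760
print(packed_bytes)                # 18584640
print(file_size - packed_bytes)    # 1746296 left for thumbnail and metadata
print(padded_bytes > file_size)    # True: padded data would not fit in the file
```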
Okay, then if I use /dcraw/ to convert the raw data to
PPM formatted image files in both 8-bit and 16-bit formats,
like this:
    dcraw -4 -c d2x_8696.nef > 16bit.ppm
    dcraw -c d2x_8696.nef > 8bit.ppm
I get files that are respectively 74.2 MB for the 16-bit
file and 37.1 MB for the 8-bit file. Since there are 3
channels in a PPM file (RGB), if we divide those numbers
by 3 we see how many bytes per channel there are.
74,200,915 / 3 is about 24,733,638 bytes for each channel
in the 16-bit file. If we divide by 2 (bytes per sample),
we get 12,366,819 pixels. Hmmm, you say... The NEF file said
it had about 22,940 more sensor locations than that! And
indeed it does, but /dcraw/ generated a 4312x2868 image
in the PPM file, which is 12,366,816 pixels and should
take up 24,733,632 bytes per channel in a 16-bit file.
That leaves a very reasonable number (6 bytes per channel,
19 bytes in all) for the header overhead.
Run the numbers for the 8-bit file, and it works out just
exactly the same. Clearly there is no zero padding in
either file.
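For anyone who wants to check the numbers, this Python sketch computes the expected size of a raw PPM (P6) file. The function name ppm_size is hypothetical, and it assumes a minimal header with single whitespace separators and no comment lines.

```python
def ppm_size(width, height, maxval):
    """Expected size in bytes of a raw PPM (P6) file.

    Assumes a minimal header: "P6", width and height, then maxval,
    each followed by one whitespace character, with no comments.
    """
    header = f"P6\n{width} {height}\n{maxval}\n".encode("ascii")
    bytes_per_sample = 1 if maxval < 256 else 2
    return len(header) + width * height * 3 * bytes_per_sample


print(ppm_size(4312, 2868, 65535))  # 74200915
print(ppm_size(4312, 2868, 255))    # 37100465
```

Both results match the file sizes quoted in this post to the byte, which leaves no room for padding in either file.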
Then, another 8-bit file can be generated from the 16-bit
output produced by /dcraw/, this time using /pamdepth/:
    dcraw -4 -c d2x_8696.nef | pamdepth 255 > 8bitpam.ppm
And it produces a file of exactly the same size, 37,100,465
bytes, as the 8-bit output file that /dcraw/ wrote directly.
--
Floyd L. Davidson <http://www.apaflo.com/floyd_davidson>
Ukpeagvik (Barrow, Alaska) floyd@apaflo.com