Robin Becker
Is there any way to get an efficient 16-bit hash in Python?
Josiah said: hash(obj)&65535
- Josiah
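A minimal sketch of the masking suggestion above, keeping only the low 16 bits of the built-in hash. The helper name `hash16` is made up for illustration. One caveat, assuming a current CPython: hashes of `str` and `bytes` are salted per process unless `PYTHONHASHSEED` is set, so these values are only comparable within a single interpreter run.

```python
def hash16(obj):
    """Return the low 16 bits of Python's built-in hash (hypothetical helper)."""
    # 0xFFFF == 65535; the result is always in 0..65535, even for negative hashes.
    return hash(obj) & 0xFFFF


print(hash16("fontBBox"))   # some value in 0..65535, may differ between runs
print(hash16((1, 2, 3)))
```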
Robin said: yes, I thought of that, but I cannot figure out if the internal hash really
distributes the bits evenly, particularly since it seems to treat integers etc.
as special cases.
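The integer special case is easy to see directly. In CPython, small non-negative integers hash to themselves (an implementation detail, not a language guarantee), so masking keeps whatever pattern the inputs already have. A rough check of evenness is to count how many inputs land in each bucket; `bucket_counts` is a hypothetical helper for that.

```python
from collections import Counter


def bucket_counts(values, bits=16):
    """Count how many values fall into each low-`bits` hash bucket."""
    mask = (1 << bits) - 1
    return Counter(hash(v) & mask for v in values)


# 100 multiples of 65536: on CPython their hashes equal the integers
# themselves, so the low 16 bits are all zero and they share one bucket.
counts = bucket_counts(range(0, 65536 * 100, 65536))
print(len(counts), "bucket(s) used for", sum(counts.values()), "inputs")
```

The same helper can be pointed at real input strings to see how evenly they spread.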
Martin v. Löwis said: So: what are your input data, and what is the
distribution among them?
Regards,
Martin
Robin said: Martin v. Löwis wrote:
0 the ideal hash
can't be argued with
I'm trying to create UniqueIDs for dynamic PostScript fonts. According
to my resources we don't actually need to use these, but if they are
required by a particular PostScript program (perhaps to make a print run
efficient) then the private range of these IDs is 4000000<=UID<=4999999,
i.e. a range of one million.
So I probably really need an 18-bit hash.
The data going into the font consists of:
fontBBox '[-415 -431 2014 2033]'
charmaps ['dup (\000) 0 get /C0 put',......]
metrics ['/C0 1251 def',.....]
bboxes ['/C29 [0 0 512 0] def',.......]
chardefs ['/C0 {newpath 224 418 m 234 336 ......def}',......]
i.e. a bunch of lists of strings which are eventually joined together and
written out with a template to make the PostScript definition.
The UniqueID is used by PS interpreters to avoid recreating particular
glyphs, so ideally I would number these fonts sequentially using a global
count, but in practice several processes separated by application and
time can produce PostScript which eventually gets merged back together.
If the UIDs clash then the printer produces very strange output.
I'm fairly sure there's no obvious Python way to ensure the separated
processes can communicate except via the printer. So either I use a
Python-based scheme which reduces the risk of clashes, i.e. random or some
data-based hash scheme, or I attempt to produce a PostScript solution
like looking for a private global sequence number.
I'm not sure my PostScript is really good enough to do the latter, so I
hoped to pursue a Python-based approach which has a low probability of
busting. Originally I thought the range was a 16-bit number, which is why
I started with 16-bit hashes.
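To put a number on "low probability of busting", here is a rough birthday-problem sketch. It assumes the IDs behave like independent, uniformly random draws from the range, which a good hash only approximates; the function name and its parameters are illustrative, not from the thread.

```python
def collision_probability(n_fonts, id_space=1_000_000):
    """Chance that at least two of n_fonts uniformly random IDs coincide."""
    if n_fonts > id_space:
        return 1.0  # pigeonhole: more fonts than IDs
    p_all_distinct = 1.0
    for k in range(n_fonts):
        # The (k+1)-th font must avoid the k IDs already taken.
        p_all_distinct *= 1.0 - k / id_space
    return 1.0 - p_all_distinct


for n in (10, 100, 1000):
    print(n, round(collision_probability(n), 4))
print(300, round(collision_probability(300, id_space=2 ** 16), 4))
```

Under these assumptions the chance of at least one clash reaches one half at roughly 1,180 fonts for the one-million range, and at roughly 300 fonts for a 16-bit range.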
Thomas said: Robin Becker wrote: ........
For identifying something, I suggest you use a hash function like SHA-1,
truncating it to as many bits as you can use, similar to what Jon Ribbens
suggested.
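One way this suggestion might look in Python, as a sketch only: hash the font's strings with SHA-1 from the standard `hashlib` module and fold the digest into the private range 4000000..4999999 mentioned earlier in the thread. The function name, the UTF-8 encoding and the length prefix between parts are assumptions made for illustration.

```python
import hashlib

UID_MIN = 4000000
UID_MAX = 4999999


def font_unique_id(parts):
    """Map an iterable of strings to an ID in UID_MIN..UID_MAX (hypothetical helper)."""
    h = hashlib.sha1()
    for part in parts:
        data = part.encode("utf-8")  # assumed encoding
        # Length prefix so that ["ab", "c"] and ["a", "bc"] feed different bytes.
        h.update(str(len(data)).encode("ascii") + b":")
        h.update(data)
    value = int.from_bytes(h.digest(), "big")
    # Modulo uses the whole one-million range, not just a power-of-two slice.
    return UID_MIN + value % (UID_MAX - UID_MIN + 1)


uid = font_unique_id(["[-415 -431 2014 2033]", "/C0 1251 def"])
print(uid)
```

Unlike the built-in `hash`, this gives the same ID for the same font data in every process, so identical fonts produced separately would agree on their UniqueID.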
.......And the UniqueID should be unique within this file, right?
Why don't you just use a serial number then?
(where I cannot control which other UniqueID's might be present).
Luckily the cheap option of not using the UniqueID at all is available,
but chances are some printer's PS interpreter will barf if it's not
present, and then I need a fairly robust way to generate reasonable
candidates.
Robin Becker said: whether it's any better than using the lowest bits I have no real
idea. I suppose (SHA being flavour of the month) I should really use