21:06 UTC

April 2017

sipa changed the topic of #bitcoin-wizards to: This channel is for discussing theoretical ideas with regard to cryptocurrencies, not about short-term Bitcoin development | http://bitcoin.ninja/ | This channel is logged. | For logs and more information, visit http://bitcoin.ninja

<nicolagreco>
are there ways to prove that a message m is of size L, given H(m) being public knowledge?

<gmaxwell>
what does size L mean? I mean if I take a message 'a' and then pad it with 500 \0s ... what size does it have?

<nicolagreco>
hey Taek (hi!) I want to promise you that I have a file and that it is of the size that I claim

<gmaxwell>
okay so split the message into blocks. length is in units of blocks. (I trust this is fine). H(m) is a tree hash. I send you "my message will be less than X blocks, it has root hash Y -- signed greg"

<gmaxwell>
now I start streaming blocks to you along with the extra tree fragments so you can constantly verify the blocks that I send you match the root.

<gmaxwell>
then if I send too many blocks, you can prove it by showing the X+1 member and my signed message.
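The scheme gmaxwell describes above can be sketched in Python (a toy illustration assuming SHA-256 and a simple binary tree; all function names are made up, not from any real library). Each streamed block comes with its Merkle path, so the receiver checks it against the committed root; a valid path for an index beyond the promised bound, together with the signed commitment, is the fraud proof.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def _next_level(level):
    """One level up: pair adjacent nodes; an odd node is promoted unchanged."""
    nxt = []
    for i in range(0, len(level), 2):
        if i + 1 < len(level):
            nxt.append(h(level[i] + level[i + 1]))
        else:
            nxt.append(level[i])
    return nxt

def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]

def merkle_proof(leaves, index):
    """Sibling hashes from leaf `index` up to the root."""
    level = [h(leaf) for leaf in leaves]
    path = []
    while len(level) > 1:
        sib = index ^ 1
        if sib < len(level):
            path.append((sib < index, level[sib]))  # (sibling-on-left?, hash)
        level = _next_level(level)
        index //= 2
    return path

def verify(root, leaf, path):
    """Recompute the root from a leaf and its sibling path."""
    node = h(leaf)
    for on_left, sibling in path:
        node = h(sibling + node) if on_left else h(node + sibling)
    return node == root

blocks = [b"block-%d" % i for i in range(5)]
root = merkle_root(blocks)
assert all(verify(root, blocks[i], merkle_proof(blocks, i)) for i in range(5))
```

In this sketch, "signed greg" would be a signature over `(root, X)`; the receiver rejects the stream, and keeps the offending path as evidence, as soon as a block verifies at index X or beyond.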

<Taek>
I'm struggling to see how this could be useful though? What's stopping you from sending blocks that aren't a part of the root?

<gmaxwell>
That's the story of cryptocurrency, 'Your scientists were so preoccupied with whether or not they could, they didn’t stop to think if they should.'

<gmaxwell>
I still think your 'how long' is kind of ill defined. e.g. make H() a tree hash and then you can have me show you the last block... to 'prove the length'. but I think that proof is mostly meaningless in most contexts.

<nicolagreco>
if you show me the last block of a merkle tree, how do I know it proves the length? you could always have hashed the previous blocks with different block sizes

<nicolagreco>
so I guess that should be the requirement, and the question now would be "can I prove without interaction that the file that I hold is of size L?"

<vega4>
it is not and it should NOT be possible to get to know the size of the message knowing its hash; if you could then the hashing function would be broken

<gmaxwell>
he asks for you to be able to prove it; this would be done by sending additional information alongside the hash.

<vega4>
sorry; that's why I said I haven't had time to read everything, but the very subject sort of struck me

<vega4>
if the server is to prove it is in possession of the data, sending part of the file to the client would give nothing as long as the server sends 100% of the file. If you divide the file into blocks and send a hash of a single block, then this would work. But then you would need a lot of hashes; a sufficient number that it would be infeasible for the server to store every chunk of data represented by a set of these hashes. or m

<vega4>
I simply can't imagine any other way of proving possession of bare data without random queries for some chunks. and the client does not want to store every chunk so he stores hashes of blocks
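vega4's challenge-response idea can be sketched as follows (a toy illustration, not a real proof-of-storage protocol; the block size and all names are made up). The client stores only one hash per block before uploading, then later challenges the server for a randomly chosen block and checks it against the stored hash.

```python
import hashlib
import os
import random

BLOCK = 4096  # hypothetical chunk size

def split_blocks(data: bytes):
    """Cut the file into fixed-size chunks."""
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

# Client: before uploading, keep only one hash per block (not the data).
data = os.urandom(5 * BLOCK)
stored_hashes = [hashlib.sha256(b).digest() for b in split_blocks(data)]

# Server: holds the full file.
server_blocks = split_blocks(data)

# Later: the client challenges a random block index; the server must
# produce the raw chunk, which the client checks against its stored hash.
idx = random.randrange(len(stored_hashes))
response = server_blocks[idx]
assert hashlib.sha256(response).digest() == stored_hashes[idx]
```

As vega4 notes, the client trades storage for soundness here: it keeps one hash per block, and each successful challenge only shows the server holds that one block, so real schemes repeat the query over time or use more compact authenticators.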

<vega4>
and then some other parties query from time to time; if the server fails to respond, he loses stake blabla

<vega4>
it seems to me that people noticed a flying bitcoin and now would also like to have a flying cat

<vega4>
and now... now I'm going to write a paper on 'PROOF-of-storage' no wait... I'm going to start a campaign just like Safe coin and get 20mln usd

<FNinTak>
there is forced interaction in the case of receiving what looks like a correct proof though, right?

<FNinTak>
I.e. if you prove to me the size of some file, & I choose to store it, then you should send the file, & possibly a more verbose proof

<vega4>
ah you mean Bob is sending a file to Alice; Alice accepts the byte stream; and then Bob is able to query Alice for some parts of the file

<vega4>
that is correct; but it has nothing to do with deriving file size from a hash, which was the OP's topic I think

<vega4>
" are there ways to prove that a message m is of size L, given H(m) being public knowledge?" that was the case after all I think

<vega4>
I believe this article is the only one on the Internet where the term "Homomorphic Identification Protocol" is present

<vega4>
Endomorphically Homomorphic Computationally Feasible Identification Protocol Based On Factorization in Finite Fields

<bsm1175321>
Hash functions used to implement e.g. sets or hash maps generally have small output lengths and only marginal collision resistance, and they're generally optimized for speed.

<bsm1175321>
They still need collision resistance though because if an attacker can find collisions, he turns your algorithm from O(1) to O(N) and it's a DoS vector.

<bsm1175321>
e.g. https://asfws12.files.wordpress.com/2012/11/asfws2012-jean_philippe_aumasson-martin_bosslet-hash_flooding_dos_reloaded.pdf
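The DoS bsm1175321 describes can be shown with a toy hash table and a deliberately weak hash (an illustration of the principle only, not of any real language runtime): once an attacker floods one bucket with colliding keys, every lookup there degrades from O(1) to a linear scan.

```python
import itertools

def weak_hash(key: bytes, size: int = 64) -> int:
    # Deliberately weak: any permutation of the same bytes collides,
    # since the byte sum is unchanged.
    return sum(key) % size

buckets = [[] for _ in range(64)]

def insert(key: bytes, value):
    buckets[weak_hash(key)].append((key, value))

def lookup(key: bytes):
    # Worst case scans the whole bucket: O(n) once an attacker floods it.
    for k, v in buckets[weak_hash(key)]:
        if k == key:
            return v
    return None

# The attacker submits 24 distinct keys that all share one byte sum.
for i, perm in enumerate(itertools.permutations(b"abcd")):
    insert(bytes(perm), i)

# All 24 keys landed in the same bucket.
assert len(buckets[weak_hash(b"abcd")]) == 24
```

Real hash-map implementations defend against exactly this by keying the hash with a random per-process seed (e.g. SipHash), so an attacker cannot precompute colliding inputs.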

<Taek>
yeah but those are hash functions that don't propose to securely provide the properties of preimage resistance and partial preimage resistance

<bsm1175321>
So you're proposing a hash function which has preimage resistance but poor collision resistance? Doesn't one imply the other?

<Taek>
attacker generally needs to know the input to find a collision in a reasonable amount of time

<Taek>
they don't imply each other, I know this b/c md5 offers preimage resistance but not collision resistance

<andytoshi>
Taek: there are definitions in the literature for one-way functions, pseudorandom functions, pseudorandom generators, etc

<andytoshi>
we usually use "hash function" as "random oracle" here because being a random oracle implies basically every property that you might need, and also it makes a lot of proofs possible

<andytoshi>
but non-random oracle proofs generally specify exactly the properties of their hash functions that they need

<andytoshi>
(and they use "hash families" which are indexed sets of hash functions, and they have fixed-width inputs, and in general this area of academic crypto seems ___designed___ to be totally unreadable)

<Taek>
bsm1175321: "On 30 December 2008, a group of researchers [snip] had used MD5 collisions to create an intermediate certificate authority certificate that appeared to be legitimate when checked by its MD5 hash"

<bsm1175321>
Taek you led me down a rabbit hole that ended up on XOFs (eXtendable-Output Functions) -- hash functions with variable-sized output, like SHAKE256.

<bsm1175321>
Seems to me that Merkle tree (and similar constructions) which hash together a set of data, actually NEED an XOF instead of a fixed size hash, since if you hash N inputs together, the hash's security should be 256-log(N) roughly.

<bsm1175321>
That is, if you make another Merkle tree with N inputs, the probability of a collision isn't 2^-256, it's higher because of the number of inputs.

<bsm1175321>
Given the large amount of data that goes into a blockchain, perhaps an XOF is an interesting object for creating Merkle trees, MMR's/Merklix type constructions, where the Merkle root has size e.g. 256+log(N) where N is the number of data elements that went into it.
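That idea can be sketched with Python's hashlib, which exposes SHAKE256 as an XOF (the sizing rule below is bsm1175321's back-of-envelope S + 2*log2(N) from this discussion, not any standardized recommendation; `tree_digest_size_bits` and `xof` are illustrative helpers):

```python
import hashlib
import math

def tree_digest_size_bits(n_leaves: int, target_security: int = 128) -> int:
    # Back-of-envelope rule: grow the output by ~2*log2(N) bits to hold the
    # birthday-collision security level constant as the tree grows.
    return 2 * target_security + 2 * math.ceil(math.log2(max(n_leaves, 2)))

def xof(data: bytes, bits: int) -> bytes:
    # SHAKE256 is an XOF: same input, any requested output length.
    return hashlib.shake_256(data).digest(bits // 8)

n = 2 ** 20                          # a million-leaf tree
bits = tree_digest_size_bits(n)      # 256 + 2*20 = 296 bits
digest = xof(b"merkle node data", bits)
assert len(digest) * 8 == bits
```

With a fixed-output hash you would instead truncate or be stuck at 256 bits regardless of N; the XOF lets the node width scale with the tree.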

<bsm1175321>
It might similarly be an interesting construction for PoW...where the PoW proof size is extended by the number of leading zeros...

<bsm1175321>
Taek: maybe Merkle trees are a bad example...there is a higher probability that there was a collision ***somewhere*** in the tree, but I think you're right, the security of the root is unchanged.

<Alanius>
Taek: in order to break a Merkle tree, you have to find a second preimage for any one node in the tree; that is easier than finding a preimage to one specific node.

<bsm1175321>
And actually there are O(N*log(N)) hash computations to create a Merkle tree...so the length extension needs to be a bit larger than I said to maintain the same security.

<Taek>
and for imbalanced trees you will do at most one extra hash per layer, so it's worst case 2N + log(n)

<bsm1175321>
What's the probability there's a hash collision, somewhere in one of Bitcoin's many merkle tree histories...it's not 128 bits.

<Taek>
you can find second preimages faster, but only by a factor of log(n) I think, and your second preimage security is 256bits, so unless your hash tree is 2^2^128 nodes, you are still safe there

<Taek>
hmm. Well first off I don't think it matters for the most part unless the collision occurs in exactly the same spot on two different trees

<Taek>
but even so, I'd imagine that there are less than 2^32 total hashes in all of bitcoin's Merkle trees

<nsh>
some ECDH pubkey validation controversy: https://research.kudelskisecurity.com/2017/04/25/should-ecdh-keys-be-validated/

<bsm1175321>
Yes, so with a 256 bit hash function you'll have 50% probability of finding a collision after computing 2^128 hashes.

<bsm1175321>
Now I wish to hold 2^128 constant as I add hashes to a Merkle tree, by extending the output size of the hashes.

<bsm1175321>
So if I generate one new hash, the probability that it has a collision with an existing hash is 2^-(128-32) = 2^-96. Therefore I should be using a hash function today that is larger by 64 bits = 320 bits, to hold the 128 bit hash-collision security factor constant.

<Taek>
no, the probability that it has a collision with an existing hash is (num existing hashes) / 2^256

<Taek>
If you have 2^128 hashes and zero collisions, the probability of getting a collision on your next hash is 2^-128

<bsm1175321>
Taek: no, if you have 2^128 256-bit hashes and zero collisions, the probability of getting a collision on your next hash is 50%. That's the birthday problem.

<Taek>
if you have 2^128 hashes and you haven't checked if you've had a collision yet, the probability that you have a collision is 50%

<bsm1175321>
That's not actually the case here...no one is checking that intermediate hashes in Bitcoin's Merkle trees don't already have collisions.

<bsm1175321>
So the probability that there's ***already*** a collision in the 2^32 hashes is 2^-96. (and that number is the same for any new one I add)

<Taek>
we've only got 2^32, each of which has at most (actually less, b/c of merkle tree structure) 2^-(256-32) chance of colliding with another

<Taek>
so each of the 2^32 hashes has a 2^-224 chance of collision. Which means 2^-192 chance total of collision in the existing tree

<bsm1175321>
Your argument is not compatible with the birthday calculation. The sqrt enters because I can have a collision with ***any*** other hash.

<bsm1175321>
Following your logic, if I had 2^128 hashes, I'd have a 2^-128 probability of finding a collision, and that's not the case. It's 50% by the birthday attack problem.

<bsm1175321>
So the required hash size to retain a fixed collision parameter is S+2*log(N) where S is the size of the starting hash function with no elements in Merkle trees, and N is the number of hashed elements so far.
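The arithmetic in this exchange can be checked against the standard birthday approximation p ≈ 1 - exp(-n^2 / 2^(b+1)) (a sketch; `birthday_collision_prob` is just an illustrative helper):

```python
import math

def birthday_collision_prob(n: float, bits: int) -> float:
    # Birthday approximation: p ≈ 1 - exp(-n^2 / 2^(bits+1)).
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(-(n * n) / 2.0 ** (bits + 1))

# 2^128 random 256-bit hashes: collision probability 1 - e^(-1/2) ≈ 0.39,
# i.e. roughly the "50% at 2^128" rule of thumb.
p_half = birthday_collision_prob(2.0 ** 128, 256)

# Bitcoin-scale: 2^32 hashes at 256 bits leaves the probability near 2^-193.
p_small = birthday_collision_prob(2.0 ** 32, 256)

# Doubling n while adding 2 output bits leaves p exactly unchanged, which is
# the S + 2*log2(N) sizing rule: (2n)^2 / 2^(b+3) = n^2 / 2^(b+1).
assert birthday_collision_prob(2.0 ** 33, 258) == p_small
```

Note the approximation also shows why Taek's per-hash figures and the birthday figure differ: the n^2 term counts every pair of hashes, not just collisions against one fixed value.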

<bsm1175321>
Taek: I know. I'm just interested abstractly in what's required to fix this security parameter.