Discussion List Archives


Re: CIF-JSON new draft

Do you consider that significant or not?

JavaScript:

x = json.example["_flight.vector"][0];
c = 0;
t = +new Date;
for (var i = 0; i < 10000000; i++) {
  c+= parseFloat(x);
}
t = +new Date - t;
document.write(t + " ms");

643 ms

That was 10,000,000 conversions. To me, that's acceptable. JavaScript is highly optimized. I'm sure the string->number conversion is one of its most optimized features.

Again, if you want actual "numbers" in the JSON, then you are asking, probably, for:

server/source: number -> string
web page: string -> number

Whether that string -> number conversion is done by the page code writer using parseFloat() or by the JSON interpreter using JSON.parse(), it is almost certainly using the same native code. Actually, JSON.parse() is more time-consuming, because it has to figure out what sort of value it has; as CIF developers, we would know we have a float:

x = json.example["_flight.vector"][0]

t = +new Date;
var c = 0
for (var i = 0; i < 10000000; i++) {
  c += parseFloat(x);
}
t = +new Date - t;
document.write(t + " ms parseFloat()");
document.write("<br>")
t = +new Date;
var c = 0
for (var i = 0; i < 10000000; i++) {
  c += JSON.parse(x);
}
t = +new Date - t;
document.write(t + " ms JSON.parse()");


659 ms parseFloat()
1999 ms JSON.parse()

So in JavaScript it is about 3x faster (659 ms vs. 1999 ms) to do the conversion yourself than to rely on the generic JSON parser.
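
For anyone who wants to try the same per-value comparison outside the browser, here is a rough Python analogue. It is only a sketch: the sample value "12.345" is made up (the actual content of _flight.vector is not shown in this thread), the helper names with_float and with_json are hypothetical, and the timings will depend on the interpreter and machine.

```python
import json
import timeit

# Hypothetical sample value; the real value in the thread comes from
# json.example["_flight.vector"][0], which is not shown.
x = "12.345"


def with_float(s):
    """Convert a string that is already known to hold a float."""
    return float(s)


def with_json(s):
    """Let the generic JSON parser work out what kind of value it is."""
    return json.loads(s)


if __name__ == "__main__":
    n = 200000  # number of conversions per measurement
    t_float = timeit.timeit(lambda: with_float(x), number=n)
    t_json = timeit.timeit(lambda: with_json(x), number=n)
    print("%.0f ms float()" % (t_float * 1000))
    print("%.0f ms json.loads()" % (t_json * 1000))
```

Both helpers return the same float for a plain decimal string; only the amount of work done to get there differs.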

Bob





On Mon, May 1, 2017 at 12:03 PM, Marcin Wojdyr <wojdyr@gmail.com> wrote:
>
> Since we are targeting this as a low overhead representation, which I associate with performance considerations, I am prepared to entertain arguments about performance impact.  I am not, however, prepared to accept unsupported assertions about performance.

Fair enough, here is a microbenchmark for you. First using Python2.7:

$ python -m timeit -s 'import json,sys' 'f=open("numbers.json"); numbers = json.load(f)["numbers"]'
100 loops, best of 3: 10.2 msec per loop
$ python -m timeit -s 'import json,sys' 'f=open("strings.json"); numbers = [float(x) for x in json.load(f)["strings"]]'
10 loops, best of 3: 58.1 msec per loop

In this case, using strings is more than 5x slower (58.1 vs. 10.2 msec).

Now with Python 3.5:

$ python3 -m timeit -s 'import json,sys' 'f=open("numbers.json"); numbers = json.load(f)["numbers"]'
100 loops, best of 3: 12.7 msec per loop
$ python3 -m timeit -s 'import json,sys' 'f=open("strings.json"); numbers = [float(x) for x in json.load(f)["strings"]]'
10 loops, best of 3: 27.4 msec per loop


The difference is smaller, but still >2x.

The input files were prepared using this script:

import json
import random
numbers = [round(random.uniform(-30, 30), 3) for _ in range(100000)]
with open('numbers.json', 'w') as f:
    json.dump({'numbers':numbers}, f)
with open('strings.json', 'w') as f:
    json.dump({'strings':[str(x) for x in numbers]}, f)


$ du -h strings.json numbers.json
984K    strings.json
788K    numbers.json
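
The two timeit commands above also open and read the file on every loop. A possible variant, sketched below, keeps both documents in memory so that only parsing and conversion are timed. The fixed seed and the function names load_numbers and load_strings are my own additions for illustration, not part of the original script, and no timings are claimed here.

```python
import json
import random
import timeit

random.seed(0)  # assumption: fixed seed so repeated runs use the same data
numbers = [round(random.uniform(-30, 30), 3) for _ in range(100000)]

# Serialize once, up front, so file I/O stays out of the timed region.
numbers_doc = json.dumps({"numbers": numbers})
strings_doc = json.dumps({"strings": [str(x) for x in numbers]})


def load_numbers(doc):
    """Parse a document whose values are JSON numbers."""
    return json.loads(doc)["numbers"]


def load_strings(doc):
    """Parse a document whose values are strings, then convert each one."""
    return [float(s) for s in json.loads(doc)["strings"]]


if __name__ == "__main__":
    cases = [
        ("numbers", load_numbers, numbers_doc),
        ("strings", load_strings, strings_doc),
    ]
    for name, fn, doc in cases:
        # Best of 3 repeats, 10 loops each, reported per loop.
        best = min(timeit.repeat(lambda: fn(doc), number=10, repeat=3)) / 10
        print("%s: %.1f msec per loop" % (name, best * 1000))
```

Both loaders should give back the same list of floats, so any difference in time comes from how the values were encoded.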


Marcin

_______________________________________________
cif-developers mailing list
cif-developers@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/cif-developers




--
Robert M. Hanson
Larson-Anderson Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr


If nature does not answer first what we want,
it is better to take what answer we get.

-- Josiah Willard Gibbs, Lecture XXX, Monday, February 5, 1900


International Union of Crystallography
