Data model
In a recent series of edits you added to Wikibase/DataModel this sentence:
The data model is conceptual ("Which information do we have to support?") and does not specify how this data should be represented technically ("Which data structures should the software use?") or syntactically ("How should the data be expressed in a file?").
But a person or group who wishes to extract data from the database must know what format the data will be in if it is to be used, especially if large quantities of data are being extracted. What document can the users refer to so they will know what the data format is? Jc3s5h (talk) 14:34, 28 February 2015 (UTC)
- I did not add this sentence. This sentence has been there from the very first revision Markus Krötzsch wrote of this document. In fact, the first revision of this document (on meta.wikimedia.org back then) consisted only of this sentence, see . I did update the yellow box to also state something similar, so it won't be missed so easily when hunting for the juicy bits of the page.
- The only binding currently documented is JSON. The specification document is maintained along with the program source code for consistency, but there is a copy on this wiki, see Wikibase/DataModel/JSON. There are some minor updates pending. If you feel that there is anything missing there, please let me know. I recommend against editing the wiki page, since it's just a copy of the primary document. If you want to submit a patch for json.wiki, please do!
Thank you for making clarifications in this edit. I immediately noticed a questionable phrase 'Years BCE are represented as negative numbers, following the traditional ordering, in which year 0 is undefined, and the year 1 BCE is represented as -0001,...' [emphasis added]. The traditional users of negative years are astronomers, and for them, the year 0 is most certainly defined, that's why they do it! It makes their arithmetic easier. I think a different phrasing is needed. Jc3s5h (talk) 19:22, 11 August 2016 (UTC)
Another point is that time zone offsets are currently not in use, which means that in effect, all the times in the database are local times. That's certainly how people have been using it; they read a source that X died in Hawaii on February 12, 1893, and that's what they put in. If you ever wanted to redefine the dates to be UT, you would have to go through the whole database, figure out the location, figure out what the time zone was at the date of the event (don't forget daylight saving time) and add the offset. Some entries won't have enough information to do that. I think you're stuck with local time forever. Jc3s5h (talk) 19:36, 11 August 2016 (UTC)
- Hi Jc3s5h! Thank you for your feedback. Times are indeed usually local times, and that's how it generally should be, since that's how they are typically given in sources. Wikidata tries to store information as given in the source, with only syntactical normalization. Normalization of units, calendars, time zones, etc., should of course be done for the query service, so values can be compared. You are correct that it would take quite a bit of effort to add time zone information to all the times we have, but currently there does not seem to be much demand for this, since we don't have any times of day. When we add support for times of day, we will also need to add support for time zones, and I imagine people will then start to enter them. I imagine that typically, dates without a time will not have a time zone set (though of course it would be good to have a time zone, even without a time of day).
- With regards to the "traditional" numbering: it's the numbering used with the traditional AD/BC notation. I did not mean to imply that it is the traditional interpretation of negative years. I have now changed the spec to say "historical numbering", since it's the numbering used by historians. It's also the numbering used by the "historical" versions of ISO (until 2000) and XSD (until 2012) -- Duesentrieb ⇌ 11:56, 12 August 2016 (UTC)
- I can't wrap my head around the idea that years like AD 4, 7 BCE, etc., can be considered the same numbering system as 4, -7. To me, the use of a negative sign rather than an abbreviation makes it a different system. So, to the best of my knowledge, historians never use -7 to mean 7 BCE and seldom use -6 to mean 7 BCE. Jc3s5h (talk) 22:20, 25 August 2016 (UTC)
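For anyone following the numbering question in this thread, here is a minimal sketch of the two conventions being compared: the "historical" numbering described in the spec quote (no year 0, 1 BCE written as -0001) and the astronomical numbering (year 0 exists and corresponds to 1 BCE). The function names are hypothetical and chosen only for illustration; this is not code from Wikibase.

```python
def historical_to_astronomical(year: int) -> int:
    """Convert a signed year in historical numbering (no year 0,
    -1 means 1 BCE) to astronomical numbering (0 means 1 BCE)."""
    if year == 0:
        raise ValueError("year 0 is undefined in historical numbering")
    # BCE years shift by one; CE years are identical in both systems.
    return year + 1 if year < 0 else year


def astronomical_to_historical(year: int) -> int:
    """Convert an astronomical year (0 means 1 BCE) to a signed year
    in historical numbering (-1 means 1 BCE)."""
    return year - 1 if year <= 0 else year


def era_label(historical_year: int) -> str:
    """Render a signed historical year with an era abbreviation."""
    if historical_year == 0:
        raise ValueError("year 0 is undefined in historical numbering")
    if historical_year < 0:
        return f"{-historical_year} BCE"
    return f"{historical_year} CE"


def years_between(historical_start: int, historical_end: int) -> int:
    """Elapsed years between two historical years. The subtraction is
    done in astronomical numbering, where it needs no special case."""
    return (historical_to_astronomical(historical_end)
            - historical_to_astronomical(historical_start))
```

With these definitions, 7 BCE is -7 in historical numbering and -6 in astronomical numbering, and the span from 1 BCE to 1 CE comes out as one year, which is the arithmetic convenience mentioned above.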
Lcy2000 (talk) 14:58, 12 February 2016 (UTC)