Talk:Requests for comment/DataStore

Sharihareswara (WMF) (talkcontribs)

Max mentioned to me that the code needs review. Furthermore, recent discussion of 3rd party dependencies is intriguing and Max may use one of Tyler's ideas. So this RfC is blocked on that.

Parent5446 (talkcontribs)

I mean, it depends on what he's trying to do. If you're looking for a library that allows you to interact with a key-value store like Cassandra or something, we may be out of luck, because there really aren't any.

Sharihareswara (WMF) (talkcontribs)

Max just told me that actually he's not awaiting any code review on DataStore.

Reply to "Current status"

Web service API with Cassandra backend

GWicke (talkcontribs)

I'm working on something fairly similar for Parsoid. We are mostly interested in revision storage, but can easily tack on a key/value store while we are at it. See User:GWicke/Notes/Storage#Key.2Fvalue_store_without_versioning for the key/value strawman.

The general idea is to spend some time on designing a good storage API, and expose that in a generic curl_multi based PHP class. A prototype implementation of the revision storage part with Cassandra backend can be found at

Reply to "Web service API with Cassandra backend"
Tim Starling (talkcontribs)

<TimStarling> I think Brion has already said something favourable about this
<TimStarling> I don't see it on the page, maybe it was in an in-person meeting
<gwicke> I like the general idea of having a key/value store available without creating extra tables
<legoktm> what gwicke said
<mwalker> I voiced in the comments that I think this should have some sort of defined structure per key -- that way we can have a unified upgrade process (like we have with a database) and also have a way of filling in initial values
<yuvipanda> +1
<mwalker> otherwise I failed to see the difference between this and just using memcache
<legoktm> this would be persistent
<TimStarling> well, persistence
<legoktm> memcache isn't
<gwicke> in distributed storage range queries are not free, so it might make sense to make those optional
<gwicke> similar with counters
<TimStarling> mwalker: so you're thinking of some sort of schema definition for values?
<mwalker> yes
<mwalker> that way you have a defined upgrade / update process
<MaxSem> schema definition: serialize( $struct )
<mwalker> MaxSem: how do you handle a multiversion jump though? if the structure has evolved and suddenly you dont have the data you expect
<mwalker> you can handle that in the consuming code of course -- but that's a lot of boilerplate that I think is redundant
<TimStarling> mwalker: what do you imagine the upgrade process would be/
<MaxSem> if you want schemas and upgrades, it's a good reason to use MySQL tables
<TimStarling> ?
<legoktm> mwalker: i think that's something that the extension needs to handle, with proper deprecation
<legoktm> and migration
<gwicke> mwalker: all you need is a way to traverse the keys and update all values I guess
<mark> some update handler per key/value on fetch?
<gwicke> you can have a version key in each JSON blob you store for example
<mark> supported by the extension
<mwalker> could do it on fetch, or could do it in a maintenance script
<mark> whichever comes first
<TimStarling> mwalker: how would a schema help with upgrading? what boilerplate would be abstracted exactly?
<gwicke> we might want different kinds of key/value stores: those that are randomly ordered and only support listing all keys, those that are ordered and allow efficient range queries, and those with special support for counter values
<mwalker> TimStarling: I imagine that this will probably be abused to store dicts and arrays -- if we know what we're coming from and going to; we can define transforms for the old data into the new
<TimStarling> the requirement for prefix queries does appear to limit the backends you could use
<gwicke> yes, or at least it creates extra overhead for those that don't need the feature
<legoktm> mwalker: i dont think storing an array is abusing the feature ;)
<TimStarling> mwalker: abused?
<MaxSem> gwicke, if you don't want to use prefix queries, don't use them
<mwalker> TimStarling: the examples given in the RfC are simple values
<gwicke> MaxSem: yes, that's why I propose to have different key/value storage classes
<MaxSem> because there can be multiple stores, you can always make some assumptions about the store you're using
<mwalker> I say abused because I see no provision for dealing with more complex values (which is what I'm proposing :))
<gwicke> /ordered-blob/ vs /blob/ for example
<TimStarling> mwalker: maybe you misunderstood MaxSem then, because he just said he thinks values should be serialized with serialize()
<mark> it would probably be good to classify those different stores in the RFC, define the ones likely needed
<yuvipanda> mwalker: perhaps add more data types to the RFC? Lists and Hashes, maybe. I guess different stores can define different datatypes that they support
<gwicke> mark: I have some notes at
<mwalker> TimStarling: yes; but serialization doesn't solve the problem of knowing what's in the structure
<MaxSem> mark, the proposal comes with a skeleton code for an SQL store and has a Mongo as another example
<mwalker> if you serialize a php class for example -- deserializing it into a class with the same name but different structure gives very unexpected results
* gwicke lobbies for JSON over serialize()
<TimStarling> I imagine it would be used like the way memcached is used
<mark> yeah, nothing too PHP specific ;)
<MaxSem> gwicke, doable
<TimStarling> i.e. avoiding objects wherever possible, primarily serializing arrays, including a version number in the array
<MaxSem> :)
<TimStarling> when you fetch a value with the wrong version, the typical response in memcached client code is to discard it
<TimStarling> with persistent data, you would instead upgrade it
<gwicke> MaxSem: ok ;)
<TimStarling> that upgrade could be done by some abstracted schema system
<TimStarling> or it could be done by the caller, correct?
<mark> also, is this proposal intended to embrace larger key/value storage applications like... images? external storage?
<mwalker> TimStarling: yes -- that's where I'm going -- but I'm agitating for the schema system so the caller doesn't have to care every place its used
<mark> it doesn't seem to be, but I believe it's not mentioned
<MaxSem> mark, I intended to maybe use it for storing images on small wikis
<TimStarling> mwalker: I think you should write about your idea in more detail
<TimStarling> since this is not exactly a familiar concept for most MW developers
<gwicke> mark: the Cassandra stuff just came up in parallel
<mwalker> TimStarling: ok; I'll write that up tonight
<TimStarling> maybe you could even write a competing RFC
<mwalker> which do you think would be better?
<MaxSem> but it's too generic for an image store of our scale
<mark> when we're either talking about many objects into the millions, or potentially very large objects into the gigabytes, that can matter a lot :)
<TimStarling> mwalker: I would like to know what the API will look like before I decide
<TimStarling> and I would want comments from more people
<MaxSem> mark, the key here is "small wikis":)
<mwalker> TimStarling: ok; I'll write it up as a separate RfC
<TimStarling> yeah, I think that would be easiest
<TimStarling> now, there are obvious applications for a schemaless data store
<gwicke> mark: objects into the gigabytes are unlikely to be handled well by a backend that is also good at small objects
<mark> gwicke: that is my point
<TimStarling> because there are already schemaless data stores in use
<TimStarling> ExternalStore, geo_updates, etc.
<gwicke> mark: I'm interested in the 'at most a few megabytes' space
<MaxSem> so far to move this proposal forward I'd like people to agree upon interface
<gwicke> primarily revision storage
<mark> yes, we should probably make that a bit more explicit in the RFC
<TimStarling> is it possible to have both a schema data store and a non-schema data store?
<TimStarling> one could be implemented using the other
<TimStarling> I think that would suit existing developers better
<mark> 2 layers of abstraction
<TimStarling> yeah, well that seems like the minimum here
<TimStarling> schemas are not so simple that you would want to do them in a few lines of code embedded in a data store class, right? you would want to have a separate class for that
<mwalker> I think this could even overlay our current memcache
<gwicke> schema as in actually storing structured data and allowing complex queries on it?
<gwicke> that sounds like sql..
<mwalker> just getStore('temporary') or something
<MaxSem> another question: does anybody want eg getMulti() and setMulti()?
<MaxSem> mwalker, temporary is BagOStuff
<TimStarling> MaxSem: ObjectCache callers don't use getMulti very often...
<gwicke> MaxSem: I think it would be great to have that capability for any service backend
<yuvipanda> +1
<mwalker> this is a PersistantBagOStuff though :) why should the API be different
<TimStarling> in core, just filebackend, by the looks
<gwicke> can be based on curl_multi
<TimStarling> but it is generally considered to be a good thing to have
<mark> it's not always efficient to implement temp/expiry/caching with every service backend
<mark> oh, misunderstood
<TimStarling> no, I think persistent storage does need a different API
<mwalker> yes; mark raised a point I hadn't thought of
<TimStarling> well, ideally
<TimStarling> redis handles persistent storage well enough with a mixed API
<gwicke> there are some backends with built-in expiry
<mwalker> *if you set a TTL of zero; it goes into the persistent store?
<gwicke> amazon handles the ttl with special request headers
<TimStarling> anyway, BagOStuff brings a lot of baggage (ha ha)
<gwicke> mwalker: you set it per object normally
<TimStarling> presumably DataStore would be simpler than BagOStuff
<gwicke> same is available in cassandra
<gwicke> but would be good to check other backends
<TimStarling> it wouldn't have incr/decr or lock/unlock
<mark> swift does it, the swift compatible ceph counterpart doesn't
<TimStarling> with a simpler API, DataStore could have more backends than BagOStuff
<MaxSem> TimStarling, I actually have increment() - wonder if it's really needed
<PleaseStand> Would we need an atomic increment for things like ss_total_edits?
<gwicke> I'm pushing for a web service API
<MaxSem> it could be helpful eg for implementing SiteStats with DataStore
<MaxSem> gwicke, web service API will be one of backends
<gwicke> PleaseStand: not atomic, but consistent
<gwicke> that should be a special storage class
<MaxSem> gwicke, know why memcached doesn't work over HTTP?
<TimStarling> MaxSem: maybe you should write a bit on the RFC about what backends you imagine this using, and what their capabilities are
<gwicke> MaxSem: efficiency for very small fetches
<mark> it's not UDP? ;)
<gwicke> afaik it's tcp
<TimStarling> w.r.t. prefix search, increment, lock, etc.
<mwalker> facebook wrote one with udp
<TimStarling> add, cas?
<MaxSem> stupid facebook
<TimStarling> ObjectCache provides all these atomic primitives
<mark> max size of objects
<gwicke> TimStarling: cas on etag?
<gwicke> can be supported optionally in some backends
<TimStarling> I just would like to know if the applications require all these atomic primitives
<TimStarling> and if that limits our backend choice
<MaxSem> TimStarling, cas doesn't seem to be very mixable with eventual-consistency backends
<TimStarling> essentially, there is a tradeoff between feature count and backend diversity, right?
<gwicke> I'd start with the minimal feature set initially
<TimStarling> so we want to know where on the spectrum to put DataStore
<mark> i think an application like gwicke is interested in (external storage like) is already quite different from the counter/stats like applications also discussed here
<gwicke> and then consider adding support for something like CAS when the use case and backend landscape is clearer
<TimStarling> that tradeoff is not discussed on the RFC, so I would like to see it discussed
<mark> agreed
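
The version-key approach raised in the meeting (a version number inside each stored JSON blob, with stale values upgraded on fetch rather than discarded) can be sketched roughly as follows. This is only an illustration in Python, not DataStore code: the `fetch` helper, the `UPGRADES` table, the field names and the plain dict standing in for the store are all hypothetical.

```python
import json

CURRENT_VERSION = 2  # hypothetical current layout version of the stored value


def upgrade_v1_to_v2(value):
    """Hypothetical transform: v1 stored 'name', v2 splits it into two fields."""
    first, _, last = value.pop("name", "").partition(" ")
    value.update({"first": first, "last": last, "version": 2})
    return value


# Maps an old version to the function that upgrades it by one step.
UPGRADES = {1: upgrade_v1_to_v2}


def fetch(store, key):
    """Read a JSON blob, upgrade it step by step if stale, and write it back."""
    raw = store.get(key)
    if raw is None:
        return None
    value = json.loads(raw)
    original_version = value.get("version", 1)
    # Chain single-step upgrades so a multi-version jump is handled too.
    while value.get("version", 1) < CURRENT_VERSION:
        value = UPGRADES[value.get("version", 1)](value)
    if original_version != value["version"]:
        store[key] = json.dumps(value)  # persist the upgraded form
    return value


# A plain dict stands in for the persistent store in this sketch.
store = {"user:1": json.dumps({"version": 1, "name": "Ada Lovelace"})}
result = fetch(store, "user:1")
```

The same `UPGRADES` table could equally be driven from a maintenance script that traverses all keys, which is the other option mentioned in the meeting.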

Reply to "IRC meeting 2013-10-02"
Dantman (talkcontribs)

I'm not sure about prefix querying like getByPrefix. Not every engine you'd want to connect this data store to is going to have a way to list keys by prefix. In fact, since what you're doing is key-value storage, most of the efficient engines you are likely to want to connect to this storage are not going to support that kind of query.

MaxSem (talkcontribs)

What engine do you have in mind? At a quick glance, this kind of search is supported by MongoDB, Cassandra, CouchDB and DynamoDB.

Dantman (talkcontribs)

Riak, Voldemort I'd expect, LevelDB, Kai, MemcacheDB, there are probably some others. Think, actual key-value databases rather than higher level NoSQL databases.

Dantman (talkcontribs)

Ok, ignore LevelDB.

MaxSem (talkcontribs)

OK, rm LevelDB, while MemcacheDB and Kai are abandonware that nobody should use, what else? :)

MaxSem (talkcontribs)

Further, Riak supports prefix search, while Voldemort doesn't have a reasonable PHP client at all, so we don't need to care about it.
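
One way to reconcile the two positions in this thread is to make prefix queries an optional capability, along the lines of the /blob/ versus /ordered-blob/ split suggested in the IRC meeting. The Python sketch below is only an illustration under that assumption: the class names and `get_by_prefix` are hypothetical and are not the interface proposed in the RfC.

```python
from abc import ABC, abstractmethod
from bisect import bisect_left


class BlobStore(ABC):
    """Minimal interface that any key/value backend can provide."""

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def set(self, key, value): ...


class OrderedBlobStore(BlobStore):
    """Optional capability: backends with ordered keys add prefix queries."""

    @abstractmethod
    def get_by_prefix(self, prefix): ...


class InMemoryOrderedStore(OrderedBlobStore):
    """Toy backend that keeps a sorted key list so prefix scans are cheap."""

    def __init__(self):
        self._data = {}
        self._keys = []  # keys kept in sorted order

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        if key not in self._data:
            self._keys.insert(bisect_left(self._keys, key), key)
        self._data[key] = value

    def get_by_prefix(self, prefix):
        result = {}
        # All keys sharing the prefix are contiguous in sorted order.
        start = bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            result[key] = self._data[key]
        return result


store = InMemoryOrderedStore()
store.set("geo:2", "b")
store.set("stats:edits", 10)
store.set("geo:1", "a")
geo = store.get_by_prefix("geo:")
```

Callers that need prefix listing would ask for an `OrderedBlobStore`; callers that only need get and set could be served by any backend, including ones that cannot list keys by prefix.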

Reply to "Prefix search"
Nikerabbit (talkcontribs)

When I was reading this I had two use cases in mind. In both cases the number of stored items could potentially grow very large, but in practice stays in the few thousands to tens of thousands. Would it still make sense to use this kind of storage for those items? Being able to fetch all keys with a given prefix is needed for these use cases.

I guess I'm just asking for more clarification on when this could be used and when it should not be used, as well as some idea about possible migration paths when you need to do more complex things.

Reply to "What is this not useful for?"
Mwalker (WMF) (talkcontribs)

Right now this looks a lot like how we interact with memcache which is probably good -- but one of the nice things about using a database is that we know what's going to be coming back; and what the defaults are when we push partial data in. It is also possible with a DB to do simple upgrades to complex datatypes (e.g. add/modify columns).

I feel like this needs some sort of method to perform migrations & the ability to set default values at install time (much like our database updater.)

We may wish to enforce typing as well with something similar to ContentHandler such that we know what is currently in a key.

MaxSem (talkcontribs)

No, this completely contradicts the purpose of my proposal - it's intentionally schemaless. If a more structured storage is needed, it's outside of this RFC's scope.

Mwalker (WMF) (talkcontribs)

OK; but it's persistent data -- therefore it's expected to always be there; there shouldn't be a huge amount of boilerplate around making sure the value exists and if not filling it in like you would have in memcache. Therefore you have to manage it across versions -- how would you do a multiversion upgrade with this if you don't know what's in the key? How do you ensure the data is initially populated? How do we ensure that this doesn't explode with eventually unused keys? IMO, there must be some method of managing it in software.

Mwalker (WMF) (talkcontribs)

I guess what I'm really trying to push here is that if you really just want to use the same API and the same boilerplate requirements as we have around memcache -- this should just transparently come into play when we specify a TTL of 0; there should be no need for the programmer to manage this.

We'll still have all the problems of managing the keyspace with TTLs of zero though.
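
For the initial-population question raised in this thread, one schemaless option is for the extension itself to register defaults and write only the missing keys at install or update time. The Python sketch below assumes exactly that; `DEFAULTS`, `populate_defaults` and the key names are hypothetical, and a plain dict stands in for the store.

```python
# Hypothetical defaults an extension might register for its own keys.
DEFAULTS = {"sitestats:total_edits": 0, "sitestats:total_pages": 0}


def populate_defaults(store, defaults):
    """Write each default only if the key is missing; return the keys added."""
    added = []
    for key, value in defaults.items():
        if key not in store:
            store[key] = value
            added.append(key)
    return added


# Existing data is left untouched; only the missing key is filled in.
store = {"sitestats:total_edits": 42}
added = populate_defaults(store, DEFAULTS)
```

This keeps the store itself schemaless, but it does not address the concern about unused keys accumulating over time.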

Reply to "Default values & In row types"