The json-based query given above is also buggy, because jsonb comparisons are sensitive to data type, so the jsonb values “12345678” and 12345678 are unequal: 12345678 as a number won’t be identified as a duplicate even though it should be.

No, it shouldn’t. Even though the two values might look similar the way you write them, they are different.
Had you chosen to write the numbers in hex, the two values would not look similar.

Note that this is arguably a PostgreSQL limitation, since a JavaScript interpreter’s loose typing means they compare as equal: 12345678 == “12345678” evaluates to true.

Note that in the JavaScript community that is considered a bug, and the use of the non-type-coercing equals operator “===” is encouraged.

Anders, interesting point re comparison. I’m actually happier knowing that it’s frowned upon in the JS world, because I’ve never been a big fan of Perl and JavaScript’s habit of implicitly converting strings and numbers for comparisons. My issue was that the behaviour was inconsistent with Pg, but it seems all we really need to do there is to document that jsonb implements equality tests to match the JavaScript non-coercing equals operator “===”, or something along those lines.

As for “should be”, I was speaking narrowly of the use case in the example, where the user would clearly want to compare “12345678” and 12345678 as equal. In general it’s clearly best to do just what jsonb does by default for comparisons – but to try to stick to relational, typed fields most of the time.
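For anyone who wants to see the behaviour being discussed, a quick illustration (the values are the ones from the example above):

    -- jsonb equality is type-sensitive: a JSON number and a JSON string
    -- never compare equal, unlike JavaScript's type-coercing "==" operator.
    SELECT '12345678'::jsonb   = '12345678'::jsonb   AS num_eq_num,  -- true
           '"12345678"'::jsonb = '"12345678"'::jsonb AS str_eq_str,  -- true
           '12345678'::jsonb   = '"12345678"'::jsonb AS num_eq_str;  -- false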
The reason I would like to use JSON in this way is that I want to be able to create files, in the sense that I create a folder on a certain person where I want to gather all the different information on this person as files in one spot. With that kind of table setup (an id and a jsonb field; see the sketch below) this becomes possible.

The problem I experience with the traditional relational approach is that for every type of information I need to create a separate table, and in some cases a join table, in order to get the information into the database. Whenever you need to add a new kind of type, or change an existing type, not only do you need to change the API and all applications, but the database as well.

Having the information spread out in so many spots makes it harder to retrieve in case of problems, or you lose the coherence of the information because one join table decides to go corrupt. If that happens, you don’t lose the coherence of the information on only one person, you lose it on every person in the database.

Do you think this approach is an anti-pattern? How would you implement something like this without using the jsonb type this way?
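For concreteness, a minimal sketch of the table setup being described – one row per person, everything else gathered in a single jsonb document (table name, column names and data are purely illustrative):

    CREATE TABLE person (
        id  bigserial PRIMARY KEY,
        doc jsonb NOT NULL
    );

    INSERT INTO person (doc) VALUES
      ('{"name": "Alice", "addresses": [{"city": "Oslo"}], "notes": ["first contact"]}');

    -- Everything about one person comes back in a single fetch.
    SELECT doc FROM person WHERE id = 1;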
It sounds like you’re not a fan of the relational approach, or it doesn’t fit your application. What you describe is the object store or document store model. If you prefer this model or it’s a better fit for your application, that’s fine – but rather than trying to force PostgreSQL into working this way, consider using a dedicated document database like MongoDB.

That said, for many apps changes to structure are relatively infrequent and easily managed, and much of the data benefits from the flexible querying that relational storage provides via aggregation, windowing, etc.
PostgreSQL’s json support really comes in handy when you have mostly relational data, but you also need to store some dynamic attributes or freeform data.

I’m a bit concerned by your comment about a table “deciding to go corrupt”. Have you been using MyISAM recently? In PostgreSQL, if you’ve got corruption anywhere you should be very concerned, and there’s no reason to assume that just one table will be damaged by whatever happened (PostgreSQL bug, filesystem bug, hardware fault, etc). It’s more likely, in my experience, that all tables recently written to will be damaged in the same way; see e.g. the multixact issues.
Isolating your data into little pieces to protect against corruption is not the solution to data corruption issues; it’s at best an incomplete and clumsy workaround for an issue that shouldn’t be present in the first place.

It is not so much about being a fan of the relational model or not, but rather about the idea that, because PostgreSQL is at heart a relational DB which now supports json as a field type, this kind of approach to using it is by default an anti-pattern. The relational model has its strengths and weaknesses, as I pointed out with the remark about the table going corrupt. I didn’t necessarily mean that the cause of that corruption would be Postgres or the file system, but any kind of corruption, including that introduced by SQL injection and/or improper use. The normalization of data of different “types” essentially makes the coherence between these types both implicit and generic, whereas JSON allows the coherence to be flexible and explicit.

I know that I described a document store – I have been using them for a few years now – but I would like a document-like store with SQL-like search capabilities.
With the jsonb type this kind of query becomes possible, as well as combining it with traditional relational approaches.

It is funny how people call it an anti-pattern when it is absolutely necessary and vital to certain applications. The best example is ecommerce.
EAV cannot be avoided when you want to create a reusable webshop, because you don’t know upfront what products the users will store. Or do you recommend putting columns like t-shirt size or height of an aquarium or ingredients of cosmetics into the relational model for each different user? Ridiculous! Stop calling it an anti-pattern. It is a valid design technique.
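One way that webshop case is often handled is a jsonb attributes column rather than either EAV or per-user columns; a rough sketch (the table and attribute names here are hypothetical):

    CREATE TABLE product (
        id         bigserial PRIMARY KEY,
        sku        text NOT NULL UNIQUE,
        price      numeric(10,2) NOT NULL,
        attributes jsonb NOT NULL DEFAULT '{}'
    );

    INSERT INTO product (sku, price, attributes) VALUES
      ('TSHIRT-42',    19.99, '{"size": "L", "colour": "black"}'),
      ('AQUARIUM-90', 249.00, '{"height_cm": 45, "volume_l": 90}');

    -- A containment query replaces the attribute/value joins an EAV schema needs.
    SELECT sku FROM product WHERE attributes @> '{"size": "L"}';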
Thanks for this older post, but still relevant. I am, for example, now considering hstore or an EAV model for configuration data (something like Java properties files), but living in PostgreSQL.

Regarding the suggestion to go to document-based storage instead of messing around with Postgres: I did some synthetic testing a few months back with JSON, comparing PostgreSQL and MongoDB, and PostgreSQL with GIN indexes was the winner. With Citus extensions you can scale across nodes easily, and I also finally used /dev/ramdisk0 for read-only nodes with replication from write-only nodes, and it is lightning fast (my tests were about 10 million+ records per table).
I also noticed quite a small storage footprint with JSONB, and there is still compression available on top of that. I think the JSON consumed even less storage than the table design, which was a surprise to me, but I am not sure if I remember correctly. NOTE: I am a fan of this kind of generic approach 😉 but I agree it can make things worse. Interesting reading.
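The setup being benchmarked presumably looked something like the following – a jsonb column with a GIN index, which is what makes containment lookups fast at that scale (the names are illustrative, not taken from the tests above):

    CREATE TABLE event (
        id      bigserial PRIMARY KEY,
        payload jsonb NOT NULL
    );

    -- The default GIN opclass supports the @>, ?, ?| and ?& operators.
    CREATE INDEX event_payload_gin ON event USING gin (payload);

    -- jsonb_path_ops builds a smaller, faster index when only @> is needed.
    CREATE INDEX event_payload_gin_path ON event USING gin (payload jsonb_path_ops);

    SELECT count(*) FROM event WHERE payload @> '{"status": "shipped"}';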
Yeah, it’s probably not a coincidence that Mongo’s analytics and BI connector product uses PostgreSQL. I think that’s entirely sensible of them, and it may well have nothing to do with the json/jsonb support, since the BI connector is focused on exposing Mongo data to relational queries for use in existing and widely established BI and data mining tools. Still interesting.

It wouldn’t shock me at all if json or jsonb were more compact on disk.
The whole datum can be compressed, whereas for normal Postgres rows we compress only large individual values. You’re also spending less space on padding and alignment, headers, etc. Indexing with jsonb is certainly helpful, but they’re fairly large indexes (GIN, GiST), and they are more likely to have to pay the cost of out-of-line TOAST storage too. I haven’t done comparative performance studies on jsonb vs EAV models and would be curious about results from those who’ve done so. Fetching values from an indexed jsonb column will definitely not perform as well as fetching individual values from a regular table.

If your choice is EAV vs jsonb, you should probably use jsonb. But you should prefer to model it relationally, not rely on EAV or jsonb or hstore or xml.
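For contrast, the EAV layout in that comparison typically looks like the sketch below: every attribute becomes a row, every value is forced into text, and reassembling one object means joining or aggregating per attribute (names are hypothetical):

    CREATE TABLE item (
        id bigserial PRIMARY KEY
    );

    CREATE TABLE item_attribute (
        item_id bigint NOT NULL REFERENCES item (id),
        name    text   NOT NULL,
        value   text   NOT NULL,   -- every value squeezed into text
        PRIMARY KEY (item_id, name)
    );

    -- Rebuilding one item's attributes as a single document:
    SELECT item_id, jsonb_object_agg(name, value) AS attrs
    FROM   item_attribute
    GROUP  BY item_id;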
I have been keeping a close eye on our query performance – as ever – and recently watched the speed of two sets of queries: one queries the join of a handful of tables; in the other, all the data is together in one JSON blob. Surprise, surprise: the JSON was outperforming the joins by a factor of six.
On further investigation, what did it come down to? Pulling one JSON blob out of the database required one cache miss. Pulling together data from five tables in a join+group by meant many more cache misses. I am now very seriously considering having JSON blobs + a few columns for IDs.
The IDs are there for the relational part of SQL, and are to be used as little as possible. Whilst the lack of an enforced structure may make database designers cringe, performance-wise it makes a lot of sense. Just because you have relations doesn’t mean that they are cheap.
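A rough sketch of the shape being described – a few real columns for identifiers and joins, with the rest of the row denormalised into one jsonb blob (all names here are hypothetical):

    CREATE TABLE order_snapshot (
        id          bigserial PRIMARY KEY,
        customer_id bigint NOT NULL,            -- kept relational for joins and FKs
        created_at  timestamptz NOT NULL DEFAULT now(),
        body        jsonb NOT NULL              -- everything else, pre-joined
    );

    -- One row fetch instead of a five-table join + GROUP BY.
    SELECT body
    FROM   order_snapshot
    WHERE  customer_id = 42
    ORDER  BY created_at DESC
    LIMIT  1;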
This is a great article, and this is about the 4th time I’ve come across it looking up Postgres’ json capability for various reasons. JSON inside a relational database is not ideal, and I’m not sure why anyone would ever start there, which is what this seems to be warning against. The only time I’ve ever had to deal with json fields in a relational DB has been in enterprise situations where the technology decisions were driven by the existing platform. Maybe it’s more common than I think.

I thought it might be useful for future visitors to have a concrete example of what I consider a good use case for json in Postgres, one that I think falls in line with the author’s views as well.
Feel free to correct me on that. The one case where I’ve really found success using json inside a relational db has been when I needed to denormalize tables into a complex structure. Typically this is a lot of heavy lifting, so I keep them in a materialized view, but I could see cases where one might want to use a plain old view.

In my case, I inherited a Postgres-backed app that needed search bolted on. The search requires specific filters, which are hierarchical. Using a materialized view and jsonb, I can wrap up everything I need for the javascript in one upfront call. The only calls back to the controller from that point on are to perform the search.
The view is indexed by the object at the top of the hierarchy and includes all the data that will ever be needed to form the search query, wrapped up in json fields. This is a Rails app, so having the structure already in json does marginally help rendering performance. I tried this from all angles (denormalizing in app code, storing denormalized objects in Elasticsearch, multiple smaller views). In the end, allowing Postgres to maintain its materialized view in the background and automatically update when the underlying data changes was the best way I found to accomplish this.

This is a bit of a digression from the topic, however if you’re a Rails developer who is hesitant to use database views because AR doesn’t have native support for view migrations,
check out the Scenic gem. (I’m not affiliated.) It just makes views really manageable and familiar for Rails development.
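For future visitors, a minimal sketch of that pattern under invented names – source tables, a materialized view that pre-joins the hierarchy into jsonb, and a non-blocking refresh:

    -- Hypothetical source tables for the hierarchical filters.
    CREATE TABLE category (
        id   bigserial PRIMARY KEY,
        name text NOT NULL
    );

    CREATE TABLE filter (
        id          bigserial PRIMARY KEY,
        category_id bigint NOT NULL REFERENCES category (id),
        parent_id   bigint REFERENCES filter (id),
        label       text NOT NULL
    );

    -- The materialized view wraps everything the search UI needs into jsonb.
    CREATE MATERIALIZED VIEW search_filters AS
    SELECT c.id,
           c.name,
           jsonb_agg(
               jsonb_build_object('id', f.id, 'label', f.label, 'parent_id', f.parent_id)
               ORDER BY f.label
           ) AS filters
    FROM   category c
    JOIN   filter   f ON f.category_id = c.id
    GROUP  BY c.id, c.name;

    -- A unique index allows refreshing without blocking readers.
    CREATE UNIQUE INDEX ON search_filters (id);
    REFRESH MATERIALIZED VIEW CONCURRENTLY search_filters;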