Projections with encoding possibilities
Hello All,
Had some of question regarding encoding type with respect to types of projection.
Want to know, how does Vertica behave in every of these scenarios
1. Create a table with encoding type..It creates projection with encoding type specified
2. Create a table without encoding type .. Does it create a super projection with encoding type as AUTO(if no type specified), Why does it not give encoding type on its own.
2. Create custom projection after already existing super projections. Does custom projection copy column encoding ?
3. use a CTAS query and create a new table..Does new projection copy column encoding from previous projections
4. using Like keyword with "including projections" ..Does the new projection copy column encoding from previous table
I am seeing many of my super projections having column encoding as AUTO..which I think Vertica should automatically comeup with the encoding types.
Also is there any relation of the column encoding with create type of projection (delayed creation, create table with PROJ clause)
According to previous versions Vertica would suggest the encoding type automatically if no encoding is specified, Is there some change in this behavior.
Thanks
Had some of question regarding encoding type with respect to types of projection.
Want to know, how does Vertica behave in every of these scenarios
1. Create a table with encoding type..It creates projection with encoding type specified
2. Create a table without encoding type .. Does it create a super projection with encoding type as AUTO(if no type specified), Why does it not give encoding type on its own.
2. Create custom projection after already existing super projections. Does custom projection copy column encoding ?
3. use a CTAS query and create a new table..Does new projection copy column encoding from previous projections
4. using Like keyword with "including projections" ..Does the new projection copy column encoding from previous table
I am seeing many of my super projections having column encoding as AUTO..which I think Vertica should automatically comeup with the encoding types.
Also is there any relation of the column encoding with create type of projection (delayed creation, create table with PROJ clause)
According to previous versions Vertica would suggest the encoding type automatically if no encoding is specified, Is there some change in this behavior.
Thanks
0
Comments
These are good questions; thanks for asking! In answer:
1. Yes, as you've said.
2. Yes, also, as you've said. The column isn't restricted to a specific encoding type because the optimal encoding type depends on the data in the column; simply knowing the data type isn't good enough. (Well, it's good enough for AUTO; but we can do much better knowing something about the data.) When you issue the CREATE TABLE statement, you haven't loaded any data yet. That's why people on other threads have said things like "you're not finished creating the table until you have loaded some data and run the DBD" -- that's how we know that there's enough data for us to choose column encodings that will work well going forward.
3. CTAS does not copy encodings from previous projections. CTAS is often used to simply copy a table. But the SELECT statement can be anything; one table, a JOIN, an analytic function with no table behind it at all... Most of those cases would not have a well-defined column encoding. (The JOIN case might, for an inner join, but it also might have a bad choice depending on what records are filtered out by the JOIN. Also depending on what queries you want to run against the new table; etc.) The single-table case does have a "correct" answer for "what projections and encodings should I use?", but CTAS doesn't currently special-case for that scenario.
4. The LIKE keyword with "INCLUDING PROJECTIONS" should copy encodings from the specified table. If you want to make a new table that looks exactly like an existing table (including physical layout on disk, etc), this is how I would recommend that you do it.
There are only three ways (as of Vertica 7) that I know of, for a table to get encoding information that is more-detailed than "AUTO":
- Table is created with LIKE .. INCLUDING PROJECTIONS, from a table with encodings defined
- The DBD has picked encodings for the table (requires running the DBD)
- The table (or projection) was declared with explicitly-specified encodings
I don't believe this has changed recently.
If you want new tables to be given encodings automatically, I would suggest that you run the DBD as part of your ETL process (or however else you create tables). Note that, as of Vertica 7, this can be done from SQL:
https://my.vertica.com/docs/7.0.x/HTML/index.htm#Authoring/SQLReferenceManual/Functions/VerticaFunct...
You'll find that determining the right encodings (and sort order, etc -- changing the sort order totally changes which encodings will work well on the sort columns because their physical layout is different) is a somewhat expensive operation. There's a reason we don't do it automatically :-)
Adam
That really helped :-)