Vertica supports sampling. Could you please explain which algorithm it uses for sampling? How effective is this algorithm when the data distribution cannot be determined in advance?
Thanks!
Answers
From the TABLESAMPLE parameter documentation: "TABLESAMPLE(percent) Specifies to return a random sampling of records [...] The number of records returned is not guaranteed to be the exact percentage specified. All rows of the data have equal opportunities to be selected. Vertica performs sampling before applying other query filters."
https://www.vertica.com/docs/12.0.x/HTML/Content/Authoring/SQLReferenceManual/Statements/SELECT/FROMClause.htm
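The documentation quoted above does not name the algorithm. As a rough illustration only, the three properties it describes (equal chance per row, an inexact row count, sampling before other filters) are what you would get from independent per-row selection, often called Bernoulli sampling. The Python sketch below simulates that behaviour under this assumption; `bernoulli_sample` is a hypothetical helper, not a Vertica function, and this is not a claim about Vertica's actual implementation.

```python
import random


def bernoulli_sample(rows, percent, seed=None):
    """Keep each row independently with probability percent/100.

    Hypothetical helper for illustration; assumes per-row independent
    selection, which the Vertica docs do not explicitly confirm.
    """
    rng = random.Random(seed)
    p = percent / 100.0
    # Every row gets the same chance, so no knowledge of the data
    # distribution is needed to draw the sample.
    return [row for row in rows if rng.random() < p]


rows = list(range(100_000))

# Sample first (roughly 10% of rows; the count varies from run to run).
sample = bernoulli_sample(rows, 10, seed=42)

# Then apply the query filter to the sampled rows only, mirroring
# "sampling before applying other query filters".
filtered = [r for r in sample if r % 2 == 0]

print(len(sample), len(filtered))
```

Because each row is kept or dropped independently of its values, a sample of this kind is unbiased with respect to any column distribution, but the returned row count only approximates the requested percentage, and very rare values may be missed in small samples.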