Database Designer

Can any one explain me how to write frequent queries(format) in .sql file to give database designer? With an example.

Comments

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited November 2017

    You can query the DC_REQUESTS_ISSUED tables. Something like this:

    [dbadmin@s18384357 ~]$ vsql -Atc "select request from dc_requests_issued where request_type = 'QUERY' and time >= trunc(sysdate) - 7 and right(request, 1) = ';' and user_name <> 'dbadmin' group by request having count(*) > 5 limit 100;" -o /home/dbadmin/frequent_queries.sql
    
    [dbadmin@s18384357 ~]$ cat /home/dbadmin/frequent_queries.sql
    select c from il;
    select count(*) from jim.irIS;
    

    So the I got at most 100 of the queries executed by all users other than "dbadmin" more than 5 times in the previous week and stuck them in the file "/home/dbadmin/frequent_queries.sql".

    Now I can pass this file to DBD!

  • Thank you jim.

  • Jim,I need pre join projection of normal table with flex tables.Can it possible?If it is not then what is the alternative solution to get best query performance?.Thanks in advance.

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited November 2017

    Hi,

    If you have a question on a separate topic, can you please create a new post?

    Anyway, wow, I've never seen that question asked before :)

    Fyi ... Pre-join projections have been deprecated in Vertica 8.0.

    See:
    https://my.vertica.com/docs/9.0.x/HTML/index.htm#Authoring/NewFeatures/_VersionIndependent/DeprecatedandRetiredFunctionality.htm

    Although as of Vertica 9.0 you can still create pre-join projections, those that involve a FLEX table are not supported...

    Example:

    dbadmin=> SELECT table_name, table_schema, is_flextable FROM v_catalog.tables WHERE table_name IN ('cars', 'names');;
     table_name | table_schema | is_flextable
    ------------+--------------+--------------
     names      | public       | f
     cars       | public       | t
    (2 rows)
    
    dbadmin=> CREATE PROJECTION cars_names_prejoin AS SELECT maplookup(cars.__raw__, 'name') name1, names.name name2 FROM cars JOIN names ON maplookup(cars.__raw__, 'name') =  names.name;
    ERROR 5600:  Invalid predicate in projection-select. Only PK=FK equijoins are allowed
    
    dbadmin=> ALTER TABLE names ADD CONSTRAINT names_pk PRIMARY KEY (name);
    WARNING 2623:  Column "name" definition changed to NOT NULL
    WARNING 4887:  Table names has data. Queries using this table may give wrong results if the data does not satisfy this constraint
    HINT:  Use analyze_constraints() to check constraint violation on data
    ALTER TABLE
    
    dbadmin=> ALTER TABLE cars ADD COLUMN name VARCHAR(10) NOT NULL DEFAULT name::VARCHAR(10);
    ALTER TABLE
    
    dbadmin=> ALTER TABLE cars ADD CONSTRAINT cars_fk1 FOREIGN KEY (name) REFERENCES names(name);
    WARNING 4887:  Table cars has data. Queries using this table may give wrong results if the data does not satisfy this constraint
    HINT:  Use analyze_constraints() to check constraint violation on data
    ALTER TABLE
    
    dbadmin=> CREATE PROJECTION cars_names_prejoin AS SELECT cars.name name1, names.name name2 FROM cars JOIN names ON cars.name = names.name;
    WARNING 7399:  Created prejoin projection 'cars_names_prejoin'. Prejoin projections have been deprecated and will be removed in a future version
    ROLLBACK 6092:  Unsupported access to flex table: No PREJOIN PROJECTION support
    

    Note that final statement: "Unsupported access to flex table: No PREJOIN PROJECTION support"

  • Thank you jim. But I need that type of requirement.Is there any way alternative to get better query performance of normal table with flex table?

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited November 2017

    The only way that I can think of to get better performance out of a FLEX table is to materialize them:

    See:
    https://my.vertica.com/docs/9.0.x/HTML/index.htm#Authoring/FlexTables/MaterializingFlexTables.htm

    Refer to the Note:
    Materializing virtual columns by promoting them to real columns can significantly improve query performance. Vertica recommends that you materialize important virtual columns before running large and complex queries. Promoted columns cause a small decrease in load performance.

  • Thank you jim.

  • If you're on a recent version of Vertica, try out flattened tables, our replacement for prejoins.
    https://my.vertica.com/docs/8.1.x/HTML/index.htm#Authoring/AnalyzingData/FlattenedTables/FlattenedTableCreate.htm?TocPath=Analyzing%20Data|Flattened%20Tables|_____2

    An example:
    create flex table bar();
    create table foo (a int, b int default (select whatever::!int from bar where bar.b = foo.a));

  • sreeblrsreeblr - Select Field - Employee

    Hi Jim, should we have only request_type query or even load is useful?

  • Hello, Ben! I'm having a similar problem here. I have a fact table with millions of registers with some dimensions. And I need to improve a performance of a query, initially I was trying to create a projection and end up with the error "Prejoin projection is deprecated", then I looked up more information about flattened tables that you suggested as a solution.
    However, I have some doubts about this:

    1- I was planning on creating a flattened table that used the fact table and the dimensions, but do not alter them, and them be able to create other flattened tables to aggregate data from fact and dimension tables.

    2- Is there a way to create a flattened table totally based on the query I was using to create a projection? I was expecting something like create table fact_wide set using [query], is there a way to do it?

    3- I only found 2 suggestions of use of flattened tables, creating a table based on other (like: CREATE TABLE fact_wide as select a, b,c from fact; and later add dimension columns one by one with a query related), or altering the fact table adding the dimension columns. Then I come to questions, if I create a flattened table like fact_wide, how the fact data can be refreshed on fact_wide?

    4- Also if I alter the fact table (which I'm not sure is a good practice), do I really have to add a query column by column even if its the same dimension table (in my case there are a lot of columns in each dimension)

  • I created another discussion to talk focused on "Prejoin Project is deprecated", flattened tables or other solutions for this deprecated item:

    https://forum.vertica.com/discussion/239997/prejoin-projection-is-deprecated-flattened-tables/p1?new=1

  • i have a doubt when i am using
    SELECT time,user_name,transaction_id,request_type,request FROM dc_requests_issued
    where request_type='QUERY'
    order by time asc
    it is giving me the list of queries that i have fired not by all the users

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited February 2019

    @AkshayBarakoti - Data Collector table data is stored in flat files in the DataCollector directory on each node. Each node stores its own data and each node has a policy (typically based on size) that determines how long the data is retained. If users are connecting to different nodes there is a potential that the data on each node might not go as far back in history as one another. Especially if most users connect to one node. The query history on that node would roll off more quickly than the other nodes giving the appearance that older queries for those users are not being retained

    Quick example:

    I have 3 nodes:

    [dbadmin ~]$ vsql -ic "SELECT node_name, node_address FROM nodes;"
    Timing is off.
        node_name    |  node_address
    -----------------+----------------
    v_mydb_node0001 | 192.168.61.227
    v_mydb_node0002 | 192.168.61.228
    v_mydb_node0003 | 192.168.61.229
    (3 rows)
    

    Set the disk space retention policy for the Data Collector component RequestsIssued (the DC_REQUESTS_ISSUED table) to 2K and clear:

    [dbadmin ~]$ vsql -ic "SELECT set_data_collector_policy('RequestsIssued', '1500', '2');"
    Timing is off.
    set_data_collector_policy
    ---------------------------
    SET
    (1 row)
    
    [dbadmin ~]$ vsql -ic "SELECT clear_data_collector('RequestsIssued');" -d mydb
    Timing is off.
    clear_data_collector
    ----------------------
    CLEAR
    (1 row)
    

    Now run a query on node 1 as user JIM and JANE, and 1 query on node 2 as user JANE:

    [dbadmin ~]$ vsql -U jim -h 192.168.61.227 -ic "SELECT * FROM dual;" -d mydb #node1
    Timing is off.
    dummy
    -------
    X
    (1 row)
    
    [dbadmin ~]$ vsql -U jane -h 192.168.61.227 -ic "SELECT * FROM dual;" -d mydb #node1
    Timing is off.
    dummy
    -------
    X
    (1 row)
    
    [dbadmin ~]$ vsql -U jane-h 192.168.61.228 -ic "SELECT * FROM dual;" -d mydb #node2
    Timing is off.
    dummy
    -------
    X
    (1 row)
    

    I have short SQL statement in a file that shows the disk usage for the Data Collector table, including the users JIM and JANE to show they still have query info stored:

    [dbadmin ~]$ cat summary.sql
    SELECT DISTINCT ri.node_name, ri.user_name, dc.current_disk_records, dc.current_disk_bytes
      FROM data_collector dc
      JOIN dc_requests_issued ri
        ON ri.node_name = dc.node_name
    WHERE dc.table_name = 'dc_requests_issued'
       AND ri.user_name IN ('jim', 'jane');
    
    [dbadmin ~]$ vsql -if summary.sql
    Timing is off.
        node_name    | user_name | current_disk_records | current_disk_bytes
    -----------------+-----------+----------------------+--------------------
    v_mydb_node0001 | jane      |                    4 |               1834
    v_mydb_node0001 | jim       |                    4 |               1834
    v_mydb_node0002 | jane      |                    2 |                768
    (3 rows)
    

    Now I will run the disk usage query several times on node one until the disk usage exceeds 2K:

    [dbadmin ~]$ vsql -if summary.sql
    Timing is off.
        node_name    | user_name | current_disk_records | current_disk_bytes
    -----------------+-----------+----------------------+--------------------
    v_mydb_node0001 | jane      |                    4 |               2035
    v_mydb_node0001 | jim       |                    4 |               2035
    v_mydb_node0002 | jane      |                    2 |                768
    (3 rows)
    
    [dbadmin ~]$ vsql -if summary.sql
    Timing is off.
        node_name    | user_name | current_disk_records | current_disk_bytes
    -----------------+-----------+----------------------+--------------------
    v_mydb_node0002 | jane      |                    2 |                768
    (1 row)
    

    See, I “lost” history for user JIM but not user JANE…

    A Possible Solution:

    To maintain a history for some period of time (i.e. 6 days), then set a time capacity for the RequestsIssued data Collector tables on all nodes:

    [dbadmin ~]$ vsql -ic "SELECT set_data_collector_time_policy('RequestsIssued', '6 days'::interval);"
    Timing is off.
    set_data_collector_time_policy
    --------------------------------
    SET
    (1 row)
    

    For more info, checkout:
    https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/Functions/VerticaFunctions/DataCollection/SET_DATA_COLLECTOR_TIME_POLICY.htm
    https://forum.vertica.com/discussion/240231/keep-data-collector-information-for-a-set-interval

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited February 2019

    @AkshayBarakoti - One more thing... Are you running the query as the DBADMIN user?

    Non admin users only see their own data in the system tables by default:

    dbadmin=> SELECT count(*) FROM dc_requests_issued WHERE request_type='QUERY';
     count
    -------
      2528
    1 row)
    
    dbadmin=> \c - testuser
    You are now connected as user "testuser".
    
    dbadmin=> SELECT count(*) FROM dc_requests_issued WHERE request_type='QUERY';
     count
    -------
         2
    
  • @Jim_Knicely thanks man really help actually i was running the query with my user but when i switched to dbadmin it works
    one more thing i am from sql server (Microsoft Transact SQL ) background and facing issues with vertica concepts and syntax.it would be a great help if you give me some quick links to learn vertica from scratch

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file