Best Of
Re: Return an array from the UDF
Have a look at the matrix_multiply example under /opt/vertica/sdk/examples/python/ScalarFunctions.py
This is a Python example, but the Python SDK essentially wraps the C++ one.
Bryan_H
Re: Error VX001/3381 when migrating Vertica 23.4 from RHEL7.5 to RHEL8
RESOLVED
It turns out the issue was that /var/vertica must exist and that user dbadmin has read+write permissions on it. I had assumed that all Vertica files are contained under /opt/vertica/ and /vertica/data/
mkdir /var/vertica
chown -R dbadmin:verticadba /var/vertica
The clue was in /vertica/data/protobi_db/v_$DATABASENAME_node0001_catalog/vertica.log and /vertica/data/protobi_db/v_$DATABASENAME_node0001_catalog/startup.log
Re: How many nodes are enough for 24 terabytes of data?
Yes, there are limits to how much a single node can handle. There are only so many slots for physical disks, and physical disk volumes have limits. Is it possible to put 24 TB into a single node? Yes, but I wouldn't recommend it. You're likely to get more bang for your buck with 3 or 4 nodes depending on hardware configurations. In order for one node to happily satisfy 24 TB of data, it would need to be a very expensive machine. A single node is also a single point of failure. With 3 nodes, you get K-safety.
A more reasonable option might be Eon mode using S3 storage. You can shove as much data as you want to into S3, and then apply as much compute as you need on the fly, or shut the whole thing down if you're not using it. So, Eon mode has a lot more flexibility here.
Re: Updating table statistics?
The value 'FULL' for statistics_type is just one of the possible values: 'FULL', 'ROWCOUNT', 'NONE'. That also means that 'FULL' statistics can very well be stale.
Imagine you have a TIMESTAMP column named creation_ts with DEFAULT CURRENT_TIMESTAMP. If you ran ANALYZE_STATISTICS() on that column yesterday and inserted more rows today, the statistics on that column will be full, but still, every query asking for today's data, say, WHERE creation_ts >= TRUNC(CURRENT_TIMESTAMP,'DD') AND creation_ts < TRUNC(CURRENT_TIMESTAMP,'DD') + 1, will search for data for which no statistics have yet been gathered. For that type of query, the table's statistics are too stale to allow for an efficient query plan, no matter what the statistics_type for that column may be.
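To refresh the stale statistics, ANALYZE_STATISTICS can simply be re-run on the affected column. The table and column names below are placeholders, not from the original question:

```sql
-- Re-gather full statistics for just the time column (returns 0 on success).
SELECT ANALYZE_STATISTICS('public.my_table', 'creation_ts');

-- On very large tables, sampling a percentage (here 10%) of the rows is cheaper.
SELECT ANALYZE_STATISTICS('public.my_table', 'creation_ts', 10);
```

Scoping the call to the one hot column keeps the refresh fast enough to schedule right after the daily load.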
Re: How to map object name in storage location grant to actual storage location
@Sergey_Cherepan_1 : Sure, I will request it.
Re: Is there any way to run a user-defined function at DB startup (Enterprise mode)?
Vertica does not provide a built-in event-driven startup hook to execute scripts automatically when the database starts.
However, the following Bash script can detect when Vertica starts by monitoring the database logs and then trigger a function.
cat vertica_startup_trigger.sh
#!/bin/bash
# Path to Vertica startup log file, found via: /opt/vertica/bin/admintools -t list_db -d YourDatabaseName
STARTUP_LOG="/CATALOG/YourDatabaseName/v_YourDatabaseName_node0001_catalog/startup.log"
# Function to trigger when Vertica starts
on_vertica_start() {
echo "Vertica started at: $(date)"
vsql -c "CALL your_post_startup_procedure();" # Replace with the actual procedure
}
# Initial check in case Vertica is already running
if grep -q '"stage" : "Startup Complete",' "$STARTUP_LOG"; then
on_vertica_start
fi
# Watch for modifications on the startup log and trigger the script only when needed
inotifywait -m -e modify "$STARTUP_LOG" --format "%w%f" |
while read -r FILE; do
if grep -q '"stage" : "Startup Complete",' "$FILE"; then
on_vertica_start
fi
done
Replace "STARTUP_LOG" with your real Vertica startup log file path.
The Vertica startup log file path is shown by: admintools -t list_db -d YourDatabaseName
One prerequisite is to ensure that the inotify-tools package is installed:
sudo apt-get install inotify-tools # Debian/Ubuntu
sudo yum install inotify-tools # RHEL/CentOS
mosheg
Re: Where is vertica-client v 25.1
Please create a support case and we can share it with you.
Re: How to map object name in storage location grant to actual storage location
Please check the vs_storage_locations system table.
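A minimal sketch of such a lookup, assuming the usual columns of that table (column names may vary by Vertica version):

```sql
-- List each storage location's identifier, label, and path, so the object
-- name shown in the grant can be matched to the actual location on disk.
SELECT location_id, label, location_path
FROM vs_storage_locations
ORDER BY location_path;
```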
Re: DROP_Partitions with Hierarchical partitioning takes forever
It's a hierarchical partitioning scheme. If you drop a recent partition, the from_key can equal the to_key, and just that partition will be dropped.
If you drop older partitions, where a whole year sits in one partition, the from_key needs to be the first day of the year and the to_key the last day. Otherwise, a new partition is created, into which everything except the data between from_key and to_key is written before the partition is dropped. And that part, creating and filling a partition, takes forever. This is also why the last parameter of DROP_PARTITIONS() is an optional Boolean with which you confirm that you really want to do that ....
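To illustrate, assume a table partitioned by day and grouped hierarchically by year; the table name and date keys below are made up for the example:

```sql
-- Fast: the keys cover a whole partition group, so it is dropped outright.
SELECT DROP_PARTITIONS('public.sales', '2020-01-01', '2020-12-31');

-- Slow: the keys cut through a group, so Vertica must split it first,
-- rewriting the rest of the group. The optional Boolean (force_split)
-- must be true to allow this.
SELECT DROP_PARTITIONS('public.sales', '2020-03-01', '2020-06-30', true);
```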
Re: capturing drop_partitions status
Please try the following:
-- Create simple partitioned table
CREATE TABLE t (c INT NOT NULL) PARTITION BY (c);
-- Insert initial data
INSERT INTO t SELECT 201907;
-- Merge partitions
SELECT do_tm_task('mergeout', 't');
-- View initial partition
SELECT partition_key
FROM partitions
WHERE projection_name = 't_super'
ORDER BY partition_key;
-- Update to create new partition
UPDATE t SET c = 202007;
-- Merge updated partitions
SELECT do_tm_task('mergeout', 't');
-- View both partitions
SELECT partition_key
FROM partitions
WHERE projection_name = 't_super'
ORDER BY partition_key;
\echo 'Create stored procedure for dropping partitions'
CREATE OR REPLACE PROCEDURE deletePartition(partition_to_drop INT)
LANGUAGE PLvSQL AS $$
DECLARE
drop_result VARCHAR(1000);
BEGIN
SELECT /*+label(drop_partitions_query)*/ DROP_Partitions(
'public.t',
partition_to_drop,
partition_to_drop
) INTO drop_result;
RAISE NOTICE 'DROP_Partitions result: %', drop_result;
EXCEPTION
WHEN OTHERS THEN
RAISE EXCEPTION 'Error in deletePartition: %', SQLERRM;
END;
$$;
\echo 'Test case 1: Should succeed - dropping old partition'
CALL deletePartition(201907);
\echo 'View partitions after successful drop'
SELECT partition_key
FROM partitions
WHERE projection_name = 't_super'
ORDER BY partition_key;
\echo 'Test case 2: Should fail - attempting to drop non-existent partition'
CALL deletePartition(201908);
Run time output:
partition_key
---------------
201907
202007
(2 rows)
Create stored procedure for dropping partitions
CREATE PROCEDURE
Test case 1: Should succeed - dropping old partition
vsql:test3.sql:52: NOTICE 2005: DROP_Partitions result: Partition dropped
deletePartition
-----------------
0
(1 row)
View partitions after successful drop
partition_key
---------------
202007
(1 row)
Test case 2: Should fail - attempting to drop non-existent partition
vsql:test3.sql:61: NOTICE 2005: DROP_Partitions result: Partition does not exist
deletePartition
-----------------
0
(1 row)
mosheg

