Best Of
Re: Return an array from the UDF
Have a look at the matrix_multiply example under /opt/vertica/sdk/examples/python/ScalarFunctions.py
This is a Python example, but the Python SDK essentially wraps the C++ one.
Bryan_H
Re: Error VX001/3381 when migrating Vertica 23.4 from RHEL7.5 to RHEL8
RESOLVED
It turns out the issue was that /var/vertica must exist and that user dbadmin has read+write permissions on it. I had assumed that all Vertica files are contained under /opt/vertica/ and /vertica/data/
mkdir /var/vertica
chown -R dbadmin:verticadba /var/vertica
The clue was in /vertica/data/protobi_db/v_$DATABASENAME_node0001_catalog/vertica.log and /vertica/data/protobi_db/v_$DATABASENAME_node0001_catalog/startup.log
Re: How many nodes are enough for 24 terabytes of data?
Yes, there are limits to how much a single node can handle. There are only so many slots for physical disks, and physical disk volumes have limits. Is it possible to put 24 TB into a single node? Yes, but I wouldn't recommend it. You're likely to get more bang for your buck with 3 or 4 nodes depending on hardware configurations. In order for one node to happily satisfy 24 TB of data, it would need to be a very expensive machine. A single node is also a single point of failure. With 3 nodes, you get K-safety.
A more reasonable option might be Eon mode using S3 storage. You can shove as much data as you want to into S3, and then apply as much compute as you need on the fly, or shut the whole thing down if you're not using it. So, Eon mode has a lot more flexibility here.
Re: Updating table statistics?
The value 'FULL' for statistics_type is just one of the possible values: 'FULL', 'ROWCOUNT', 'NONE'. That also means that 'FULL' statistics can very well be stale.
Imagine you have a TIMESTAMP column named creation_ts with DEFAULT CURRENT_TIMESTAMP. If you ran ANALYZE_STATISTICS() on that column yesterday and inserted more rows today, the statistics on that column will be full, but still, every query asking for today's data, say, WHERE creation_ts >= TRUNC(CURRENT_TIMESTAMP,'DD') AND creation_ts < TRUNC(CURRENT_TIMESTAMP,'DD') + 1, will search for data for which no statistics have yet been gathered. For that type of query, the table's statistics are too stale to allow for an efficient query plan, no matter what the statistics_type for that column may be.
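To refresh the stale statistics, ANALYZE_STATISTICS can simply be re-run on the affected column. The table and column names below are placeholders, not from the original question:

```sql
-- Re-gather full statistics for just the time column (returns 0 on success).
SELECT ANALYZE_STATISTICS('public.my_table', 'creation_ts');

-- On very large tables, sampling a percentage (here 10%) of the rows is cheaper.
SELECT ANALYZE_STATISTICS('public.my_table', 'creation_ts', 10);
```

Scoping the call to the one hot column keeps the refresh fast enough to schedule right after the daily load.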
Re: How to map object name in storage location grant to actual storage location
@Sergey_Cherepan_1 : Sure, I will request it.
Re: Is there any way to run a user-defined function at DB startup (Enterprise mode)?
Vertica does not provide a built-in event-driven startup hook to execute scripts automatically when the database starts.
However, the following Bash script can detect when Vertica starts by monitoring the database logs and then trigger a function.
cat vertica_startup_trigger.sh
#!/bin/bash
# Path to Vertica startup log file, found via: /opt/vertica/bin/admintools -t list_db -d YourDatabaseName
STARTUP_LOG="/CATALOG/YourDatabaseName/v_YourDatabaseName_node0001_catalog/startup.log"
# Function to trigger when Vertica starts
on_vertica_start() {
echo "Vertica started at: $(date)"
vsql -c "CALL your_post_startup_procedure();" # Replace with the actual procedure
}
# Initial check in case Vertica is already running
if grep -q '"stage" : "Startup Complete",' "$STARTUP_LOG"; then
on_vertica_start
fi
# Watch for modifications on the startup log and trigger the script only when needed
inotifywait -m -e modify "$STARTUP_LOG" --format "%w%f" |
while read -r FILE; do
if grep -q '"stage" : "Startup Complete",' "$FILE"; then
on_vertica_start
fi
done
Replace "STARTUP_LOG" with your real Vertica startup log file path.
The Vertica startup log file path is shown by: admintools -t list_db -d YourDatabaseName
One prerequisite is to ensure that the inotify-tools package is installed:
sudo apt-get install inotify-tools # Debian/Ubuntu
sudo yum install inotify-tools # RHEL/CentOS
mosheg
Re: Where is vertica-client v 25.1
Please create a support case and we can share it with you.
Re: How to map object name in storage location grant to actual storage location
Please check the vs_storage_locations system table.
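A minimal sketch of such a lookup, assuming the usual columns of that table (column names may vary by Vertica version):

```sql
-- List each storage location's identifier, label, and path, so the object
-- name shown in the grant can be matched to the actual location on disk.
SELECT location_id, label, location_path
FROM vs_storage_locations
ORDER BY location_path;
```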
Re: DROP_Partitions with Hierarchical partitioning takes forever
It's a hierarchical partitioning scheme. If you drop a recent partition, the from_key can equal the to_key, and just that partition will be dropped.
If you drop older partitions, where a whole year sits in one partition, the from_key needs to be the first day of the year and the to_key the last day. Otherwise, a new partition is created, into which everything except the data between from_key and to_key is written before the partition is dropped. And that part, creating and filling a partition, takes forever. This is also why the last parameter of DROP_PARTITIONS() is an optional Boolean with which you confirm that you really want to do that ....
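To illustrate, assume a table partitioned by day and grouped hierarchically by year; the table name and date keys below are made up for the example:

```sql
-- Fast: the keys cover a whole partition group, so it is dropped outright.
SELECT DROP_PARTITIONS('public.sales', '2020-01-01', '2020-12-31');

-- Slow: the keys cut through a group, so Vertica must split it first,
-- rewriting the rest of the group. The optional Boolean (force_split)
-- must be true to allow this.
SELECT DROP_PARTITIONS('public.sales', '2020-03-01', '2020-06-30', true);
```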
Re: capturing drop_partitions status
Please try the following:
-- Create simple partitioned table
CREATE TABLE t (c INT NOT NULL) PARTITION BY (c);
-- Insert initial data
INSERT INTO t SELECT 201907;
-- Merge partitions
SELECT do_tm_task('mergeout', 't');
-- View initial partition
SELECT partition_key
FROM partitions
WHERE projection_name = 't_super'
ORDER BY partition_key;
-- Update to create new partition
UPDATE t SET c = 202007;
-- Merge updated partitions
SELECT do_tm_task('mergeout', 't');
-- View both partitions
SELECT partition_key
FROM partitions
WHERE projection_name = 't_super'
ORDER BY partition_key;
\echo 'Create stored procedure for dropping partitions'
CREATE OR REPLACE PROCEDURE deletePartition(partition_to_drop INT)
LANGUAGE PLvSQL AS $$
DECLARE
drop_result VARCHAR(1000);
BEGIN
SELECT /*+label(drop_partitions_query)*/ DROP_Partitions(
'public.t',
partition_to_drop,
partition_to_drop
) INTO drop_result;
RAISE NOTICE 'DROP_Partitions result: %', drop_result;
EXCEPTION
WHEN OTHERS THEN
RAISE EXCEPTION 'Error in deletePartition: %', SQLERRM;
END;
$$;
\echo 'Test case 1: Should succeed - dropping old partition'
CALL deletePartition(201907);
\echo 'View partitions after successful drop'
SELECT partition_key
FROM partitions
WHERE projection_name = 't_super'
ORDER BY partition_key;
\echo 'Test case 2: Should fail - attempting to drop non-existent partition'
CALL deletePartition(201908);
Run time output:
partition_key
---------------
201907
202007
(2 rows)
Create stored procedure for dropping partitions
CREATE PROCEDURE
Test case 1: Should succeed - dropping old partition
vsql:test3.sql:52: NOTICE 2005: DROP_Partitions result: Partition dropped
deletePartition
-----------------
0
(1 row)
View partitions after successful drop
partition_key
---------------
202007
(1 row)
Test case 2: Should fail - attempting to drop non-existent partition
vsql:test3.sql:61: NOTICE 2005: DROP_Partitions result: Partition does not exist
deletePartition
-----------------
0
(1 row)
mosheg

