Discussion:
PgAgent daemon terminated itself when it was running a time-consuming job.
Eugene Wang
2018-09-06 16:19:29 UTC
Permalink
My environment is Ubuntu 16.04 LTS, Postgres 9.6.8 with PgAgent 4.0
installed and setup successfully.

I tried to use a PgAgent Job to pull data from a MySQL table using FDW on a
routine basis in order to keep a updated copy of the mysql table. This job
has two steps. When I run those steps directly, the first one would take
less than 30 mins and the second one usually takes 44 minutes. However,
when I try to run it with PgAgent, my logs showed me that only the first
step was finished, and PgAgent daemon was terminated during this job and it
would never execute the second step.

Is there a solution on this issue? If it is beyond the capability of
PgAgent, what solution would help me? the system cron? I really like the
feature of PgAgent that steps could be executed one by one if the previous
is done, so I would hope I could just set up some configuration and get
this fixed.

Thank you.


Yujie Wang (Welcome to call me Eugene, for your pronunciation convenience)
GIS Developer
Connected2fiber
Linkedin : https://www.linkedin.com/in/yujie-wang-61a129a0/
Eugene Wang
2018-09-06 18:00:13 UTC
Permalink
My environment is Ubuntu 16.04 LTS, Postgres 9.6.8 with PgAgent 4.0
installed and setup successfully.

I tried to use a PgAgent Job to pull data from a MySQL table using FDW on a
routine basis in order to keep a updated copy of the mysql table. This job
has two steps. When I run those steps directly, the first one would take
less than 30 mins and the second one usually takes 44 minutes. However,
when I try to run it with PgAgent, my logs showed me that only the first
step was finished, and PgAgent daemon was terminated during this job and it
would never execute the second step.

Is there a solution on this issue? If it is beyond the capability of
PgAgent, what solution would help me? the system cron? I really like the
feature of PgAgent that steps could be executed one by one if the previous
is done, so I would hope I could just set up some configuration and get
this fixed.

Thank you.


Yujie Wang (Welcome to call me Eugene, for your pronunciation convenience)
GIS Developer
Connected2fiber
Linkedin : https://www.linkedin.com/in/yujie-wang-61a129a0/
Neel Patel
2018-09-07 04:37:32 UTC
Permalink
Hi,

Can you please share the pgAgent logs in debug mode ? If possible, also
share the job details or screenshots so that we can reproduce at our end.

Thanks,
Neel Patel
Post by Eugene Wang
My environment is Ubuntu 16.04 LTS, Postgres 9.6.8 with PgAgent 4.0
installed and setup successfully.
I tried to use a PgAgent Job to pull data from a MySQL table using FDW on
a routine basis in order to keep a updated copy of the mysql table. This
job has two steps. When I run those steps directly, the first one would
take less than 30 mins and the second one usually takes 44 minutes.
However, when I try to run it with PgAgent, my logs showed me that only the
first step was finished, and PgAgent daemon was terminated during this job
and it would never execute the second step.
Is there a solution on this issue? If it is beyond the capability of
PgAgent, what solution would help me? the system cron? I really like the
feature of PgAgent that steps could be executed one by one if the previous
is done, so I would hope I could just set up some configuration and get
this fixed.
Thank you.
Yujie Wang (Welcome to call me Eugene, for your pronunciation
convenience)
GIS Developer
Connected2fiber
Linkedin : https://www.linkedin.com/in/yujie-wang-61a129a0/
Eugene Wang
2018-11-13 21:54:22 UTC
Permalink
Hi Neel and PgAdmin,

Finally, I have come back on this issue now. Thank you Neel for replying. I
tried to reproduce this issue in debug mode in the first time, but it then
got passed. Now, in the second time, I am able to reproduce this error in
DEBUG modes. Before I showed you the logs, let me briefly describe what the
PgAgent job is doing.

PgAgent Job Description: I want to periodically synchronize a few tables
stored in MySQL to PostgreSQL. Now I am using mysql_fdw FDW plugin to let
Postgres talk to MySQL. I wrote a few stored procedures for each one of
those tables in order to pull, index and rename new tables and delete old
tables. These functions worked when I ran them without PgAgent. Since these
databases are proprietary to our company, I am sorry that I am not able to
make a test case for reproduction, but, as you can imagine, they are simple
functions except that one or two have 10M+ records but I do not think it
will be an issue in this case.

In this PgAgent Job, I have three steps: 1st step is calling 4 functions to
synchronize 4 tables; 2nd step is calling 1 function to synchronize a large
table named as "md_location", and 3rd step is to vacuum and analyze those
tables. The problem

The problem happened at the beginning of the 2nd step.

PgAgent Logs I have deleted meaningless repetitive logs:):

Tue Nov 13 11:26:58 2018 DEBUG: Sleeping...

Tue Nov 13 11:27:03 2018 DEBUG: Clearing inactive connections

Tue Nov 13 11:27:03 2018 DEBUG: Connection stats: total - 1, free - 0,
deleted - 0

Tue Nov 13 11:27:03 2018 DEBUG: Checking for jobs to run

Tue Nov 13 11:27:03 2018 DEBUG: Creating job thread for job 6

Tue Nov 13 11:27:03 2018 DEBUG: Destroying job thread for job 6

Tue Nov 13 11:27:03 2018 DEBUG: Destroying job thread for job 6

Tue Nov 13 11:27:03 2018 DEBUG: Destroying job thread for job 6

Tue Nov 13 11:27:03 2018 DEBUG: Sleeping...

Tue Nov 13 11:27:03 2018 DEBUG: Creating DB connection:
user=pgagent_user hostaddr=127.0.0.1
port=5432 dbname=postgres

Tue Nov 13 11:27:03 2018 DEBUG: Parsing connection information...

Tue Nov 13 11:27:03 2018 DEBUG: Allocating new connection for the database
with connection string: user=pgagent_user dbname=postgres
hostaddr=127.0.0.1 port=5432...

Tue Nov 13 11:27:03 2018 DEBUG: Starting job: 6

Tue Nov 13 11:27:03 2018 DEBUG: Creating DB connection:
user=pgagent_user hostaddr=127.0.0.1
port=5432 dbname=gis_dev

Tue Nov 13 11:27:03 2018 DEBUG: Parsing connection information...

Tue Nov 13 11:27:03 2018 DEBUG: Allocating new connection for the database
with connection string: user=pgagent_user dbname=gis_dev hostaddr=127.0.0.1
port=5432...

Tue Nov 13 11:27:03 2018 DEBUG: Executing SQL step 10(part of job 6)

NOTICE: table "md_location_classification_raw" does not exist, skipping

Tue Nov 13 11:29:03 2018 DEBUG: Sleeping...

NOTICE: table "md_location_classification_type_raw" does not exist,
skipping

Tue Nov 13 11:30:28 2018 DEBUG: Sleeping...

NOTICE: identifier
"md_location_classification_type_local_md_location_classification_id_idx"
will be truncated to
"md_location_classification_type_local_md_location_classificatio"

Tue Nov 13 11:30:38 2018 DEBUG: Sleeping...

NOTICE: table "md_entity_raw" does not exist, skipping

Tue Nov 13 11:40:58 2018 DEBUG: Sleeping...

NOTICE: table "md_entity_location_raw" does not exist, skipping

Tue Nov 13 11:50:14 2018 DEBUG: Sleeping...

Tue Nov 13 11:50:17 2018 DEBUG: Parsing connection information...

Tue Nov 13 11:50:17 2018 DEBUG: Returning the connection to the connection
pool: 'user=pgagent_user dbname=gis_dev hostaddr=127.0.0.1 port=5432'...

Tue Nov 13 11:50:17 2018 DEBUG: Parsing connection information...

Tue Nov 13 11:50:17 2018 DEBUG: Using the existing connection
'user=pgagent_user dbname=gis_dev hostaddr=127.0.0.1 port=5432'...

Tue Nov 13 11:50:17 2018 DEBUG: Executing SQL step 11(part of job 6)

Tue Nov 13 11:50:18 2018 WARNING: Query error: PANIC: ERRORDATA_STACK_SIZE
exceeded

CONTEXT: SQL statement "import foreign schema schema_abc

limit to (md_location)

from SERVER c2f_prod

into fd_mysql"

PL/pgSQL function eugene.extract_md_location_raw() line 6 at SQL statement

SSL SYSCALL error: EOF detected


Tue Nov 13 11:50:18 2018 DEBUG: Parsing connection information...

Tue Nov 13 11:50:18 2018 DEBUG: Returning the connection to the connection
pool: 'user=pgagent_user dbname=gis_dev hostaddr=127.0.0.1 port=5432'...

WARNING: terminating connection because of crash of another server process

DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.

HINT: In a moment you should be able to reconnect to the database and
repeat your command.

Tue Nov 13 11:50:18 2018 WARNING: Query error: server closed the connection
unexpectedly

This probably means the server terminated abnormally

before or while processing the request.


Tue Nov 13 11:50:18 2018 DEBUG: Parsing connection information...

Tue Nov 13 11:50:18 2018 DEBUG: Returning the connection to the connection
pool: 'user=pgagent_user dbname=postgres hostaddr=127.0.0.1 port=5432'...

Tue Nov 13 11:50:18 2018 DEBUG: Completed job: 6

Tue Nov 13 11:50:18 2018 DEBUG: Destroying job thread for job 6

Tue Nov 13 11:50:19 2018 DEBUG: Clearing inactive connections

Tue Nov 13 11:50:19 2018 DEBUG: Connection stats: total - 3, free - 2,
deleted - 2

Tue Nov 13 11:50:19 2018 DEBUG: Checking for jobs to run

WARNING: terminating connection because of crash of another server process

DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.

HINT: In a moment you should be able to reconnect to the database and
repeat your command.

Tue Nov 13 11:50:19 2018 WARNING: Query error: server closed the connection
unexpectedly

This probably means the server terminated abnormally

before or while processing the request.


Tue Nov 13 11:50:19 2018 ERROR: Failed to query jobs table!

Thank you.

Yujie Wang (Welcome to call me Eugene, for your pronunciation convenience)
GIS Developer
Connected2fiber
Linkedin : https://www.linkedin.com/in/yujie-wang-61a129a0/
Post by Neel Patel
Hi,
Can you please share the pgAgent logs in debug mode ? If possible, also
share the job details or screenshots so that we can reproduce at our end.
Thanks,
Neel Patel
Post by Eugene Wang
My environment is Ubuntu 16.04 LTS, Postgres 9.6.8 with PgAgent 4.0
installed and setup successfully.
I tried to use a PgAgent Job to pull data from a MySQL table using FDW on
a routine basis in order to keep a updated copy of the mysql table. This
job has two steps. When I run those steps directly, the first one would
take less than 30 mins and the second one usually takes 44 minutes.
However, when I try to run it with PgAgent, my logs showed me that only the
first step was finished, and PgAgent daemon was terminated during this job
and it would never execute the second step.
Is there a solution on this issue? If it is beyond the capability of
PgAgent, what solution would help me? the system cron? I really like the
feature of PgAgent that steps could be executed one by one if the previous
is done, so I would hope I could just set up some configuration and get
this fixed.
Thank you.
Yujie Wang (Welcome to call me Eugene, for your pronunciation
convenience)
GIS Developer
Connected2fiber
Linkedin : https://www.linkedin.com/in/yujie-wang-61a129a0/
Neel Patel
2018-11-15 05:18:05 UTC
Permalink
Hi,

From the logs, job is executing but it looks like some schema is missing
while executing job. See below logs.

NOTICE: table "md_entity_raw" does not exist, skipping

Tue Nov 13 11:40:58 2018 DEBUG: Sleeping...

NOTICE: table "md_entity_location_raw" does not exist, skipping

Is it exists ? If not, can you make sure all required schemas are available
during execution. OR share the sample job code sql with use to debug
further.

Thanks,
Neel Patel
Post by Eugene Wang
Hi Neel and PgAdmin,
Finally, I have come back on this issue now. Thank you Neel for replying.
I tried to reproduce this issue in debug mode in the first time, but it
then got passed. Now, in the second time, I am able to reproduce this error
in DEBUG modes. Before I showed you the logs, let me briefly describe what
the PgAgent job is doing.
PgAgent Job Description: I want to periodically synchronize a few tables
stored in MySQL to PostgreSQL. Now I am using mysql_fdw FDW plugin to let
Postgres talk to MySQL. I wrote a few stored procedures for each one of
those tables in order to pull, index and rename new tables and delete old
tables. These functions worked when I ran them without PgAgent. Since
these databases are proprietary to our company, I am sorry that I am not
able to make a test case for reproduction, but, as you can imagine, they
are simple functions except that one or two have 10M+ records but I do not
think it will be an issue in this case.
In this PgAgent Job, I have three steps: 1st step is calling 4 functions
to synchronize 4 tables; 2nd step is calling 1 function to synchronize a
large table named as "md_location", and 3rd step is to vacuum and analyze
those tables. The problem
The problem happened at the beginning of the 2nd step.
Tue Nov 13 11:26:58 2018 DEBUG: Sleeping...
Tue Nov 13 11:27:03 2018 DEBUG: Clearing inactive connections
Tue Nov 13 11:27:03 2018 DEBUG: Connection stats: total - 1, free - 0,
deleted - 0
Tue Nov 13 11:27:03 2018 DEBUG: Checking for jobs to run
Tue Nov 13 11:27:03 2018 DEBUG: Creating job thread for job 6
Tue Nov 13 11:27:03 2018 DEBUG: Destroying job thread for job 6
Tue Nov 13 11:27:03 2018 DEBUG: Destroying job thread for job 6
Tue Nov 13 11:27:03 2018 DEBUG: Destroying job thread for job 6
Tue Nov 13 11:27:03 2018 DEBUG: Sleeping...
Tue Nov 13 11:27:03 2018 DEBUG: Creating DB connection: user=pgagent_user
hostaddr=127.0.0.1 port=5432 dbname=postgres
Tue Nov 13 11:27:03 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:27:03 2018 DEBUG: Allocating new connection for the database
with connection string: user=pgagent_user dbname=postgres
hostaddr=127.0.0.1 port=5432...
Tue Nov 13 11:27:03 2018 DEBUG: Starting job: 6
Tue Nov 13 11:27:03 2018 DEBUG: Creating DB connection: user=pgagent_user
hostaddr=127.0.0.1 port=5432 dbname=gis_dev
Tue Nov 13 11:27:03 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:27:03 2018 DEBUG: Allocating new connection for the database
with connection string: user=pgagent_user dbname=gis_dev hostaddr=127.0.0.1
port=5432...
Tue Nov 13 11:27:03 2018 DEBUG: Executing SQL step 10(part of job 6)
NOTICE: table "md_location_classification_raw" does not exist, skipping
Tue Nov 13 11:29:03 2018 DEBUG: Sleeping...
NOTICE: table "md_location_classification_type_raw" does not exist,
skipping
Tue Nov 13 11:30:28 2018 DEBUG: Sleeping...
NOTICE: identifier
"md_location_classification_type_local_md_location_classification_id_idx"
will be truncated to
"md_location_classification_type_local_md_location_classificatio"
Tue Nov 13 11:30:38 2018 DEBUG: Sleeping...
NOTICE: table "md_entity_raw" does not exist, skipping
Tue Nov 13 11:40:58 2018 DEBUG: Sleeping...
NOTICE: table "md_entity_location_raw" does not exist, skipping
Tue Nov 13 11:50:14 2018 DEBUG: Sleeping...
Tue Nov 13 11:50:17 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:50:17 2018 DEBUG: Returning the connection to the connection
pool: 'user=pgagent_user dbname=gis_dev hostaddr=127.0.0.1 port=5432'...
Tue Nov 13 11:50:17 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:50:17 2018 DEBUG: Using the existing connection
'user=pgagent_user dbname=gis_dev hostaddr=127.0.0.1 port=5432'...
Tue Nov 13 11:50:17 2018 DEBUG: Executing SQL step 11(part of job 6)
Tue Nov 13 11:50:18 2018 WARNING: Query error: PANIC: ERRORDATA_STACK_SIZE
exceeded
CONTEXT: SQL statement "import foreign schema schema_abc
limit to (md_location)
from SERVER c2f_prod
into fd_mysql"
PL/pgSQL function eugene.extract_md_location_raw() line 6 at SQL statement
SSL SYSCALL error: EOF detected
Tue Nov 13 11:50:18 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:50:18 2018 DEBUG: Returning the connection to the connection
pool: 'user=pgagent_user dbname=gis_dev hostaddr=127.0.0.1 port=5432'...
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
Tue Nov 13 11:50:18 2018 WARNING: Query error: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
Tue Nov 13 11:50:18 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:50:18 2018 DEBUG: Returning the connection to the connection
pool: 'user=pgagent_user dbname=postgres hostaddr=127.0.0.1 port=5432'...
Tue Nov 13 11:50:18 2018 DEBUG: Completed job: 6
Tue Nov 13 11:50:18 2018 DEBUG: Destroying job thread for job 6
Tue Nov 13 11:50:19 2018 DEBUG: Clearing inactive connections
Tue Nov 13 11:50:19 2018 DEBUG: Connection stats: total - 3, free - 2,
deleted - 2
Tue Nov 13 11:50:19 2018 DEBUG: Checking for jobs to run
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
Tue Nov 13 11:50:19 2018 WARNING: Query error: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
Tue Nov 13 11:50:19 2018 ERROR: Failed to query jobs table!
Thank you.
Yujie Wang (Welcome to call me Eugene, for your pronunciation
convenience)
GIS Developer
Connected2fiber
Linkedin : https://www.linkedin.com/in/yujie-wang-61a129a0/
Post by Neel Patel
Hi,
Can you please share the pgAgent logs in debug mode ? If possible, also
share the job details or screenshots so that we can reproduce at our end.
Thanks,
Neel Patel
Post by Eugene Wang
My environment is Ubuntu 16.04 LTS, Postgres 9.6.8 with PgAgent 4.0
installed and setup successfully.
I tried to use a PgAgent Job to pull data from a MySQL table using FDW
on a routine basis in order to keep a updated copy of the mysql table. This
job has two steps. When I run those steps directly, the first one would
take less than 30 mins and the second one usually takes 44 minutes.
However, when I try to run it with PgAgent, my logs showed me that only the
first step was finished, and PgAgent daemon was terminated during this job
and it would never execute the second step.
Is there a solution on this issue? If it is beyond the capability of
PgAgent, what solution would help me? the system cron? I really like the
feature of PgAgent that steps could be executed one by one if the previous
is done, so I would hope I could just set up some configuration and get
this fixed.
Thank you.
Yujie Wang (Welcome to call me Eugene, for your pronunciation convenience)
GIS Developer
Connected2fiber
Linkedin : https://www.linkedin.com/in/yujie-wang-61a129a0/
Eugene Wang
2018-11-15 18:19:35 UTC
Permalink
Hi Neel,

Please just ignore the NOTICE messages, those were there for making sure no
table is using the same names of temporary table. Also, these messages were
issued in the first job step while the issue happened at the beginning of
the second job step. See below logs:

Tue Nov 13 11:50:17 2018 DEBUG: Executing SQL step 11(part of job 6)
Tue Nov 13 11:50:18 2018 WARNING: Query error: PANIC: ERRORDATA_STACK_SIZE
exceeded

I can make sure that all required schemas are available during execution,
and each function works perfectly when they are executed without PgAgent.

I want to share one suspicion but I cannot prove it. Since my stored
procedures have worked for a year, they should not be the problem. Also,
now the problem apparently happened right after the second job step begins.
Given the reproduced error message Query error: PANIC: ERRORDATA_STACK_SIZE
exceeded, I infer that the connection for the second job step was not a new
connection, at least some stack or memory stuffs were not cleaned after the
first job step finished. Do you think so?

Thanks,
Eugene
Post by Neel Patel
Hi,
From the logs, job is executing but it looks like some schema is missing
while executing job. See below logs.
NOTICE: table "md_entity_raw" does not exist, skipping
Tue Nov 13 11:40:58 2018 DEBUG: Sleeping...
NOTICE: table "md_entity_location_raw" does not exist, skipping
Is it exists ? If not, can you make sure all required schemas are
available during execution. OR share the sample job code sql with use to
debug further.
Thanks,
Neel Patel
Post by Eugene Wang
Hi Neel and PgAdmin,
Finally, I have come back on this issue now. Thank you Neel for replying.
I tried to reproduce this issue in debug mode in the first time, but it
then got passed. Now, in the second time, I am able to reproduce this error
in DEBUG modes. Before I showed you the logs, let me briefly describe what
the PgAgent job is doing.
PgAgent Job Description: I want to periodically synchronize a few tables
stored in MySQL to PostgreSQL. Now I am using mysql_fdw FDW plugin to let
Postgres talk to MySQL. I wrote a few stored procedures for each one of
those tables in order to pull, index and rename new tables and delete old
tables. These functions worked when I ran them without PgAgent. Since
these databases are proprietary to our company, I am sorry that I am not
able to make a test case for reproduction, but, as you can imagine, they
are simple functions except that one or two have 10M+ records but I do not
think it will be an issue in this case.
In this PgAgent Job, I have three steps: 1st step is calling 4 functions
to synchronize 4 tables; 2nd step is calling 1 function to synchronize a
large table named as "md_location", and 3rd step is to vacuum and analyze
those tables. The problem
The problem happened at the beginning of the 2nd step.
Tue Nov 13 11:26:58 2018 DEBUG: Sleeping...
Tue Nov 13 11:27:03 2018 DEBUG: Clearing inactive connections
Tue Nov 13 11:27:03 2018 DEBUG: Connection stats: total - 1, free - 0,
deleted - 0
Tue Nov 13 11:27:03 2018 DEBUG: Checking for jobs to run
Tue Nov 13 11:27:03 2018 DEBUG: Creating job thread for job 6
Tue Nov 13 11:27:03 2018 DEBUG: Destroying job thread for job 6
Tue Nov 13 11:27:03 2018 DEBUG: Destroying job thread for job 6
Tue Nov 13 11:27:03 2018 DEBUG: Destroying job thread for job 6
Tue Nov 13 11:27:03 2018 DEBUG: Sleeping...
Tue Nov 13 11:27:03 2018 DEBUG: Creating DB connection: user=pgagent_user
hostaddr=127.0.0.1 port=5432 dbname=postgres
Tue Nov 13 11:27:03 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:27:03 2018 DEBUG: Allocating new connection for the
database with connection string: user=pgagent_user dbname=postgres
hostaddr=127.0.0.1 port=5432...
Tue Nov 13 11:27:03 2018 DEBUG: Starting job: 6
Tue Nov 13 11:27:03 2018 DEBUG: Creating DB connection: user=pgagent_user
hostaddr=127.0.0.1 port=5432 dbname=gis_dev
Tue Nov 13 11:27:03 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:27:03 2018 DEBUG: Allocating new connection for the
database with connection string: user=pgagent_user dbname=gis_dev
hostaddr=127.0.0.1 port=5432...
Tue Nov 13 11:27:03 2018 DEBUG: Executing SQL step 10(part of job 6)
NOTICE: table "md_location_classification_raw" does not exist, skipping
Tue Nov 13 11:29:03 2018 DEBUG: Sleeping...
NOTICE: table "md_location_classification_type_raw" does not exist,
skipping
Tue Nov 13 11:30:28 2018 DEBUG: Sleeping...
NOTICE: identifier
"md_location_classification_type_local_md_location_classification_id_idx"
will be truncated to
"md_location_classification_type_local_md_location_classificatio"
Tue Nov 13 11:30:38 2018 DEBUG: Sleeping...
NOTICE: table "md_entity_raw" does not exist, skipping
Tue Nov 13 11:40:58 2018 DEBUG: Sleeping...
NOTICE: table "md_entity_location_raw" does not exist, skipping
Tue Nov 13 11:50:14 2018 DEBUG: Sleeping...
Tue Nov 13 11:50:17 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:50:17 2018 DEBUG: Returning the connection to the
connection pool: 'user=pgagent_user dbname=gis_dev hostaddr=127.0.0.1
port=5432'...
Tue Nov 13 11:50:17 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:50:17 2018 DEBUG: Using the existing connection
'user=pgagent_user dbname=gis_dev hostaddr=127.0.0.1 port=5432'...
Tue Nov 13 11:50:17 2018 DEBUG: Executing SQL step 11(part of job 6)
Tue Nov 13 11:50:18 2018 WARNING: Query error: PANIC: ERRORDATA_STACK_SIZE
exceeded
CONTEXT: SQL statement "import foreign schema schema_abc
limit to (md_location)
from SERVER c2f_prod
into fd_mysql"
PL/pgSQL function eugene.extract_md_location_raw() line 6 at SQL statement
SSL SYSCALL error: EOF detected
Tue Nov 13 11:50:18 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:50:18 2018 DEBUG: Returning the connection to the
connection pool: 'user=pgagent_user dbname=gis_dev hostaddr=127.0.0.1
port=5432'...
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
Tue Nov 13 11:50:18 2018 WARNING: Query error: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
Tue Nov 13 11:50:18 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:50:18 2018 DEBUG: Returning the connection to the
connection pool: 'user=pgagent_user dbname=postgres hostaddr=127.0.0.1
port=5432'...
Tue Nov 13 11:50:18 2018 DEBUG: Completed job: 6
Tue Nov 13 11:50:18 2018 DEBUG: Destroying job thread for job 6
Tue Nov 13 11:50:19 2018 DEBUG: Clearing inactive connections
Tue Nov 13 11:50:19 2018 DEBUG: Connection stats: total - 3, free - 2,
deleted - 2
Tue Nov 13 11:50:19 2018 DEBUG: Checking for jobs to run
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
Tue Nov 13 11:50:19 2018 WARNING: Query error: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
Tue Nov 13 11:50:19 2018 ERROR: Failed to query jobs table!
Thank you.
Yujie Wang (Welcome to call me Eugene, for your pronunciation
convenience)
GIS Developer
Connected2fiber
Linkedin : https://www.linkedin.com/in/yujie-wang-61a129a0/
Post by Neel Patel
Hi,
Can you please share the pgAgent logs in debug mode ? If possible, also
share the job details or screenshots so that we can reproduce at our end.
Thanks,
Neel Patel
Post by Eugene Wang
My environment is Ubuntu 16.04 LTS, Postgres 9.6.8 with PgAgent 4.0
installed and setup successfully.
I tried to use a PgAgent Job to pull data from a MySQL table using FDW
on a routine basis in order to keep a updated copy of the mysql table. This
job has two steps. When I run those steps directly, the first one would
take less than 30 mins and the second one usually takes 44 minutes.
However, when I try to run it with PgAgent, my logs showed me that only the
first step was finished, and PgAgent daemon was terminated during this job
and it would never execute the second step.
Is there a solution on this issue? If it is beyond the capability of
PgAgent, what solution would help me? the system cron? I really like the
feature of PgAgent that steps could be executed one by one if the previous
is done, so I would hope I could just set up some configuration and get
this fixed.
Thank you.
Yujie Wang (Welcome to call me Eugene, for your pronunciation convenience)
GIS Developer
Connected2fiber
Linkedin : https://www.linkedin.com/in/yujie-wang-61a129a0/
Neel Patel
2018-11-19 11:46:30 UTC
Permalink
Hi,
Post by Eugene Wang
Hi Neel,
Please just ignore the NOTICE messages, those were there for making sure
no table is using the same names of temporary table. Also, these messages
were issued in the first job step while the issue happened at the beginning
Tue Nov 13 11:50:17 2018 DEBUG: Executing SQL step 11(part of job 6)
Tue Nov 13 11:50:18 2018 WARNING: Query error: PANIC: ERRORDATA_STACK_SIZE
exceeded
I can make sure that all required schemas are available during execution,
and each function works perfectly when they are executed without PgAgent.
I want to share one suspicion but I cannot prove it. Since my stored
procedures have worked for a year, they should not be the problem. Also,
now the problem apparently happened right after the second job step begins.
Given the reproduced error message Query error: PANIC: ERRORDATA_STACK_SIZE
exceeded, I infer that the connection for the second job step was not a
new connection, at least some stack or memory stuffs were not cleaned after
the first job step finished. Do you think so?
Which version of database server are you using ?
Have you set any value "max_stack_depth" in postgresql.conf file ?
I have tried to reproduce with multiple job steps but not able to reproduce
the issue. This is something special with your job step. Can you please
share the sample example with us ? That will be useful to identify the root
cause of the problem.
Post by Eugene Wang
Thanks,
Eugene
Post by Neel Patel
Hi,
From the logs, job is executing but it looks like some schema is missing
while executing job. See below logs.
NOTICE: table "md_entity_raw" does not exist, skipping
Tue Nov 13 11:40:58 2018 DEBUG: Sleeping...
NOTICE: table "md_entity_location_raw" does not exist, skipping
Is it exists ? If not, can you make sure all required schemas are
available during execution. OR share the sample job code sql with use to
debug further.
Thanks,
Neel Patel
Post by Eugene Wang
Hi Neel and PgAdmin,
Finally, I have come back on this issue now. Thank you Neel for
replying. I tried to reproduce this issue in debug mode in the first time,
but it then got passed. Now, in the second time, I am able to reproduce
this error in DEBUG modes. Before I showed you the logs, let me briefly
describe what the PgAgent job is doing.
PgAgent Job Description: I want to periodically synchronize a few
tables stored in MySQL to PostgreSQL. Now I am using mysql_fdw FDW plugin
to let Postgres talk to MySQL. I wrote a few stored procedures for each one
of those tables in order to pull, index and rename new tables and delete
old tables. These functions worked when I ran them without PgAgent.
Since these databases are proprietary to our company, I am sorry that I am
not able to make a test case for reproduction, but, as you can imagine,
they are simple functions except that one or two have 10M+ records but I do
not think it will be an issue in this case.
In this PgAgent Job, I have three steps: 1st step is calling 4
functions to synchronize 4 tables; 2nd step is calling 1 function to
synchronize a large table named as "md_location", and 3rd step is to vacuum
and analyze those tables. The problem
The problem happened at the beginning of the 2nd step.
Tue Nov 13 11:26:58 2018 DEBUG: Sleeping...
Tue Nov 13 11:27:03 2018 DEBUG: Clearing inactive connections
Tue Nov 13 11:27:03 2018 DEBUG: Connection stats: total - 1, free - 0,
deleted - 0
Tue Nov 13 11:27:03 2018 DEBUG: Checking for jobs to run
Tue Nov 13 11:27:03 2018 DEBUG: Creating job thread for job 6
Tue Nov 13 11:27:03 2018 DEBUG: Destroying job thread for job 6
Tue Nov 13 11:27:03 2018 DEBUG: Destroying job thread for job 6
Tue Nov 13 11:27:03 2018 DEBUG: Destroying job thread for job 6
Tue Nov 13 11:27:03 2018 DEBUG: Sleeping...
Tue Nov 13 11:27:03 2018 DEBUG: Creating DB connection: user=pgagent_user
hostaddr=127.0.0.1 port=5432 dbname=postgres
Tue Nov 13 11:27:03 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:27:03 2018 DEBUG: Allocating new connection for the
database with connection string: user=pgagent_user dbname=postgres
hostaddr=127.0.0.1 port=5432...
Tue Nov 13 11:27:03 2018 DEBUG: Starting job: 6
Tue Nov 13 11:27:03 2018 DEBUG: Creating DB connection: user=pgagent_user
hostaddr=127.0.0.1 port=5432 dbname=gis_dev
Tue Nov 13 11:27:03 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:27:03 2018 DEBUG: Allocating new connection for the
database with connection string: user=pgagent_user dbname=gis_dev
hostaddr=127.0.0.1 port=5432...
Tue Nov 13 11:27:03 2018 DEBUG: Executing SQL step 10(part of job 6)
NOTICE: table "md_location_classification_raw" does not exist, skipping
Tue Nov 13 11:29:03 2018 DEBUG: Sleeping...
NOTICE: table "md_location_classification_type_raw" does not exist,
skipping
Tue Nov 13 11:30:28 2018 DEBUG: Sleeping...
NOTICE: identifier
"md_location_classification_type_local_md_location_classification_id_idx"
will be truncated to
"md_location_classification_type_local_md_location_classificatio"
Tue Nov 13 11:30:38 2018 DEBUG: Sleeping...
NOTICE: table "md_entity_raw" does not exist, skipping
Tue Nov 13 11:40:58 2018 DEBUG: Sleeping...
NOTICE: table "md_entity_location_raw" does not exist, skipping
Tue Nov 13 11:50:14 2018 DEBUG: Sleeping...
Tue Nov 13 11:50:17 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:50:17 2018 DEBUG: Returning the connection to the
connection pool: 'user=pgagent_user dbname=gis_dev hostaddr=127.0.0.1
port=5432'...
Tue Nov 13 11:50:17 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:50:17 2018 DEBUG: Using the existing connection
'user=pgagent_user dbname=gis_dev hostaddr=127.0.0.1 port=5432'...
Tue Nov 13 11:50:17 2018 DEBUG: Executing SQL step 11(part of job 6)
Tue Nov 13 11:50:18 2018 WARNING: Query error: PANIC: ERRORDATA_STACK_SIZE
exceeded
CONTEXT: SQL statement "import foreign schema schema_abc
limit to (md_location)
from SERVER c2f_prod
into fd_mysql"
PL/pgSQL function eugene.extract_md_location_raw() line 6 at SQL statement
SSL SYSCALL error: EOF detected
Tue Nov 13 11:50:18 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:50:18 2018 DEBUG: Returning the connection to the
connection pool: 'user=pgagent_user dbname=gis_dev hostaddr=127.0.0.1
port=5432'...
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
Tue Nov 13 11:50:18 2018 WARNING: Query error: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
Tue Nov 13 11:50:18 2018 DEBUG: Parsing connection information...
Tue Nov 13 11:50:18 2018 DEBUG: Returning the connection to the
connection pool: 'user=pgagent_user dbname=postgres hostaddr=127.0.0.1
port=5432'...
Tue Nov 13 11:50:18 2018 DEBUG: Completed job: 6
Tue Nov 13 11:50:18 2018 DEBUG: Destroying job thread for job 6
Tue Nov 13 11:50:19 2018 DEBUG: Clearing inactive connections
Tue Nov 13 11:50:19 2018 DEBUG: Connection stats: total - 3, free - 2,
deleted - 2
Tue Nov 13 11:50:19 2018 DEBUG: Checking for jobs to run
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
Tue Nov 13 11:50:19 2018 WARNING: Query error: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
Tue Nov 13 11:50:19 2018 ERROR: Failed to query jobs table!
Thank you.
Yujie Wang (Welcome to call me Eugene, for your pronunciation convenience)
GIS Developer
Connected2fiber
Linkedin : https://www.linkedin.com/in/yujie-wang-61a129a0/
Post by Neel Patel
Hi,
Can you please share the pgAgent logs in debug mode ? If possible, also
share the job details or screenshots so that we can reproduce at our end.
Thanks,
Neel Patel
Post by Eugene Wang
My environment is Ubuntu 16.04 LTS, Postgres 9.6.8 with PgAgent 4.0
installed and setup successfully.
I tried to use a PgAgent Job to pull data from a MySQL table using FDW
on a routine basis in order to keep a updated copy of the mysql table. This
job has two steps. When I run those steps directly, the first one would
take less than 30 mins and the second one usually takes 44 minutes.
However, when I try to run it with PgAgent, my logs showed me that only the
first step was finished, and PgAgent daemon was terminated during this job
and it would never execute the second step.
Is there a solution on this issue? If it is beyond the capability of
PgAgent, what solution would help me? the system cron? I really like the
feature of PgAgent that steps could be executed one by one if the previous
is done, so I would hope I could just set up some configuration and get
this fixed.
Thank you.
Yujie Wang (Welcome to call me Eugene, for your pronunciation convenience)
GIS Developer
Connected2fiber
Linkedin : https://www.linkedin.com/in/yujie-wang-61a129a0/
Loading...