
Re: cloud-init failing to run user-data, about 1 in 10 times



Findings: with an ASG of 20 nodes running cloud-init v18.3-3, one node
failed to run the user-data.

Entire user-data script for reference:

#!/bin/bash
echo "I was here" | tee -a /tmp/uds-logging

I was also able to run `cloud-init analyze show` on both the failing and
the succeeding nodes.  The failing nodes seem to skip the
"modules-final" stage entirely.
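
To make that comparison quicker across many nodes, the analyze output
can be reduced to the stages it reports as finished. This is only a
rough sketch: the function names finished_stages and missing_stages are
made up for illustration, and the list of expected stages is simply
taken from the successful run shown in this message.

```shell
#!/bin/sh
# Print the name of every stage that a saved `cloud-init analyze show`
# output reports as finished. Matching lines look like:
#   Finished stage: (modules-final) 00.24300 seconds
finished_stages() {
    sed -n 's/^Finished stage: (\([^)]*\)).*$/\1/p' "$@"
}

# Print each expected stage that has no "Finished stage" line.
# The expected list is an assumption based on the successful run.
missing_stages() {
    ms_found="$(finished_stages "$@")"
    for ms_stage in init-local init-network modules-config modules-final; do
        printf '%s\n' "$ms_found" | grep -qx -- "$ms_stage" || echo "$ms_stage"
    done
}
```

For example, after saving the output on a node with
`sudo cloud-init analyze show > /tmp/analyze.txt` (the path is
arbitrary), `missing_stages /tmp/analyze.txt` should print
"modules-final" for the failed run and nothing for the successful one.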

Successful run:
-----------------------------------------------------------------------------------------------------------------------------------

$ sudo cloud-init analyze show
-- Boot Record 01 --
The total time elapsed since completing an event is printed after the
"@" character.
The time the event takes is printed after the "+" character.

Starting stage: single
Starting stage: init-local
|`->cache invalid in datasource: DataSourceEc2 @18647.94500s +05.06200s
|`->no local data found from DataSourceEc2Local @18653.00900s +00.03200s
Finished stage: (init-local) 05.14900 seconds

Starting stage: init-network
|`->no cache found @18653.56600s +00.00000s
|`->found network data from DataSourceEc2 @18653.57000s +00.12600s
|`->setting up datasource @18653.73800s +00.00000s
|`->reading and applying user-data @18653.74700s +00.00300s
|`->reading and applying vendor-data @18653.75000s +00.00000s
|`->activating datasource @18653.76600s +00.00100s
|`->config-migrator ran successfully @18653.82700s +00.00100s
|`->config-seed_random ran successfully @18653.82800s +00.00100s
|`->config-growpart ran successfully @18653.82900s +00.04800s
|`->config-bootcmd ran successfully @18653.87800s +00.00000s
|`->config-write-files ran successfully @18653.87800s +00.00100s
|`->config-growpart ran successfully @18653.87900s +00.02400s
|`->config-resizefs ran successfully @18653.90300s +00.02000s
|`->config-disk_setup ran successfully @18653.92400s +00.00100s
|`->config-mounts ran successfully @18653.92500s +00.00500s
|`->config-set_hostname ran successfully @18653.93000s +00.00100s
|`->config-update_hostname ran successfully @18653.93100s +00.00100s
|`->config-update_etc_hosts ran successfully @18653.93200s +00.00400s
|`->config-ca-certs ran successfully @18653.93600s +00.00100s
|`->config-rsyslog ran successfully @18653.93700s +00.00100s
|`->config-users-groups ran successfully @18653.93800s +00.00100s
|`->config-ssh ran successfully @18653.94300s +00.22000s
Finished stage: (init-network) 00.60700 seconds

Starting stage: modules-config
|`->config-emit_upstart ran successfully @18655.03900s +00.00100s
|`->config-ssh-import-id ran successfully @18655.04100s +00.00100s
|`->config-locale ran successfully @18655.04200s +00.00100s
|`->config-set-passwords ran successfully @18655.04400s +00.00100s
|`->config-grub-dpkg ran successfully @18655.04500s +00.23700s
|`->config-apt-pipelining ran successfully @18655.28300s +00.00300s
|`->config-apt-configure ran successfully @18655.28600s +00.09200s
|`->config-ntp ran successfully @18655.37900s +00.00200s
|`->config-timezone ran successfully @18655.38100s +00.00200s
|`->config-disable-ec2-metadata ran successfully @18655.38300s +00.00100s
|`->config-runcmd ran successfully @18655.38400s +00.00100s
|`->config-byobu ran successfully @18655.38600s +00.00100s
Finished stage: (modules-config) 00.39000 seconds

Starting stage: modules-final
|`->config-package-update-upgrade-install ran successfully @18658.53600s +00.00200s
|`->config-fan ran successfully @18658.53800s +00.00100s
|`->config-puppet ran successfully @18658.54000s +00.00100s
|`->config-chef ran successfully @18658.54100s +00.00100s
|`->config-salt-minion ran successfully @18658.54200s +00.00100s
|`->config-mcollective ran successfully @18658.54300s +00.00100s
|`->config-rightscale_userdata ran successfully @18658.54500s +00.00400s
|`->config-scripts-vendor ran successfully @18658.55000s +00.00200s
|`->config-scripts-per-once previously ran @18658.55300s +00.00000s
|`->config-scripts-per-boot ran successfully @18658.55300s +00.00100s
|`->config-scripts-per-instance ran successfully @18658.55400s +00.00100s
|`->config-scripts-user ran successfully @18658.55600s +00.00800s
|`->config-ssh-authkey-fingerprints ran successfully @18658.56500s +00.00900s
|`->config-keys-to-console ran successfully @18658.57500s +00.08700s
|`->config-phone-home ran successfully @18658.66200s +00.00200s
|`->config-final-message ran successfully @18658.66500s +00.00700s
|`->config-power-state-change ran successfully @18658.67300s +00.00100s
Finished stage: (modules-final) 00.24300 seconds

Total Time: 6.38900 seconds

1 boot records analyzed

-----------------------------------------------------------------------------------------------------------------------------------
Failed run:
-----------------------------------------------------------------------------------------------------------------------------------
$ sudo cloud-init analyze show
-- Boot Record 01 --
The total time elapsed since completing an event is printed after the
"@" character.
The time the event takes is printed after the "+" character.

Starting stage: single
Starting stage: init-local
|`->cache invalid in datasource: DataSourceEc2 @18619.79200s +05.04800s
|`->no local data found from DataSourceEc2Local @18624.84200s +00.01800s
Finished stage: (init-local) 05.09500 seconds

Starting stage: init-network
|`->no cache found @18625.41900s +00.00000s
|`->found network data from DataSourceEc2 @18625.42300s +00.11500s
|`->setting up datasource @18625.58000s +00.00000s
|`->reading and applying user-data @18625.59000s +00.00300s
|`->reading and applying vendor-data @18625.59300s +00.00000s
|`->activating datasource @18625.60900s +00.00100s
|`->config-migrator ran successfully @18625.65900s +00.00100s
|`->config-seed_random ran successfully @18625.66000s +00.00100s
|`->config-growpart ran successfully @18625.66100s +00.03700s
|`->config-bootcmd ran successfully @18625.69800s +00.00000s
|`->config-write-files ran successfully @18625.69900s +00.00000s
|`->config-growpart ran successfully @18625.70000s +00.02400s
|`->config-resizefs ran successfully @18625.72400s +00.02000s
|`->config-disk_setup ran successfully @18625.74500s +00.00100s
|`->config-mounts ran successfully @18625.74700s +00.00200s
|`->config-set_hostname ran successfully @18625.75000s +00.00100s
|`->config-update_hostname ran successfully @18625.75100s +00.00100s
|`->config-update_etc_hosts ran successfully @18625.75300s +00.00400s
|`->config-ca-certs ran successfully @18625.75700s +00.00100s
|`->config-rsyslog ran successfully @18625.75800s +00.00000s
|`->config-users-groups ran successfully @18625.75900s +00.00100s
|`->config-ssh ran successfully @18625.76000s +00.20700s
Finished stage: (init-network) 00.55900 seconds

Starting stage: modules-config
|`->config-emit_upstart ran successfully @18626.96000s +00.00000s
|`->config-ssh-import-id ran successfully @18626.96100s +00.00200s
|`->config-locale ran successfully @18626.96300s +00.00200s
|`->config-set-passwords ran successfully @18626.96500s +00.00100s
|`->config-grub-dpkg ran successfully @18626.96600s +00.24200s
|`->config-apt-pipelining ran successfully @18627.20800s +00.00200s
|`->config-apt-configure ran successfully @18627.21200s +00.08000s
|`->config-ntp ran successfully @18627.29200s +00.00100s
|`->config-timezone ran successfully @18627.29400s +00.00100s
|`->config-disable-ec2-metadata ran successfully @18627.29500s +00.00100s
|`->config-runcmd ran successfully @18627.29600s +00.00100s
|`->config-byobu ran successfully @18627.29700s +00.00100s
Finished stage: (modules-config) 00.43600 seconds

Total Time: 6.09000 seconds

1 boot records analyzed
-----------------------------------------------------------------------------------------------------------------------------------

How can I triage this further?

Thank you for your time and assistance,
Jason

On Tue, Dec 4, 2018 at 11:12 AM Jason Price <japrice@gmail.com> wrote:
>
> On Tue, Dec 4, 2018 at 3:06 AM Thomas Goirand <zigo@debian.org> wrote:
> >
> > On 12/4/18 4:00 AM, Jason Price wrote:
> > > Thank you for any assistance, and please let me know how I can help.
> > Could you try with the latest cloud-init from Sid?
>
> Alright, I've begun testing with the new build.  I'll respond back if
> that helps, but it'll take some time to gain the confidence

