Basics
- Installation docs
https://cert-manager.io/docs/installation/helm/#installing-with-helm
- Uninstallation docs
https://cert-manager.io/docs/installation/helm/#uninstalling
- Helm chart docs
https://artifacthub.io/packages/helm/cert-manager/cert-manager
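Main objects
- Issuer: namespace-scoped issuer, created by the user
- ClusterIssuer: cluster-scoped issuer, created by the user
- Certificate: the certificate object, created by the user
- CertificateRequests: certificate request objects, created automatically from a Certificate
- Orders: ACME orders, created automatically for a Certificate (through its CertificateRequest)
- Challenges: objects that prove domain ownership, created automatically for a Certificate (through its Order). They are deleted automatically once the certificate request succeeds.
One Challenge is created for every dnsName in the Certificate object, and Challenges are validated serially, so a slow network can make the whole process take a long time. ☠ Challenges are the objects most likely to fail and need to be watched closely. Typical causes are a misconfigured DNS token, network errors, or a webhook bug that prevents the DNS API from being called correctly.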
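Since Challenges are the most failure-prone objects, a small helper for inspecting them can save time. The sketch below only wraps two standard kubectl calls; the function name `inspect_challenges` is made up for illustration and is not part of cert-manager.

```shell
# Hypothetical helper: list the Challenges in a namespace and print their details.
# Usage: inspect_challenges <namespace>
inspect_challenges() {
  ns="$1"
  # Overview of every Challenge that currently exists in the namespace.
  kubectl get challenges -n "$ns" -o wide
  # Details: the Status and Events sections usually carry the actual error.
  kubectl describe challenges -n "$ns"
}
```

For example, `inspect_challenges dev-zyh` shows the Challenges of the namespace used later in these notes. An empty result can simply mean that all Challenges already succeeded and were deleted.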
Installation
Create the CRDs
➜ kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.6.1/cert-manager.crds.yaml
Add the Helm repo
➜ helm repo add jetstack https://charts.jetstack.io
Install cert-manager
➜ helm install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--version v1.6.1 \
--create-namespace \
--set prometheus.enabled=true \
--set webhook.timeoutSeconds=4
Basic self-test
This self-test involves neither a real issuer nor a third-party webhook, so it only confirms that cert-manager itself works.
➜ cat << EOF | tee test-selfsigned.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager-test
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: test-selfsigned
  namespace: cert-manager-test
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: selfsigned-cert
  namespace: cert-manager-test
spec:
  commonName: example.com
  secretName: selfsigned-cert-tls
  issuerRef:
    name: test-selfsigned
EOF
➜ kubectl apply -f test-selfsigned.yaml
➜ kubectl get Issuers,ClusterIssuers,Certificates,CertificateRequests,Orders,Challenges -n cert-manager-test -o wide
NAME                                     READY   STATUS   AGE
issuer.cert-manager.io/test-selfsigned   True             49s

NAME                                          READY   SECRET                ISSUER            STATUS                                          AGE
certificate.cert-manager.io/selfsigned-cert   True    selfsigned-cert-tls   test-selfsigned   Certificate is up to date and has not expired   49s

NAME                                                       APPROVED   DENIED   READY   ISSUER            REQUESTOR                                         STATUS                                         AGE
certificaterequest.cert-manager.io/selfsigned-cert-dwtdj   True                True    test-selfsigned   system:serviceaccount:cert-manager:cert-manager   Certificate fetched from issuer successfully   49s
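If the output contains "Certificate is up to date and has not expired" and "Certificate fetched from issuer successfully", everything works.
If it does not, check whether every object reports True and investigate each one that reports False: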
➜ kubectl describe issuer -n cert-manager-test
➜ kubectl describe certificaterequests -n cert-manager-test
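Instead of polling `kubectl get` by hand, the readiness check can be scripted with `kubectl wait`. This is a minimal sketch: the function name `wait_cert_ready` and the 120s default timeout are assumptions chosen for illustration.

```shell
# Hypothetical helper: block until a Certificate reports Ready=True.
# Usage: wait_cert_ready <certificate-name> <namespace> [timeout]
wait_cert_ready() {
  name="$1"
  ns="$2"
  timeout="${3:-120s}"   # assumed default, adjust to your environment
  kubectl wait --for=condition=Ready "certificate/$name" -n "$ns" --timeout="$timeout"
}
```

For the self-test above this would be called as `wait_cert_ready selfsigned-cert cert-manager-test`. A non-zero exit status means the Certificate did not become Ready within the timeout, which is the point at which the describe commands become useful.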
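Requesting a certificate
To request a certificate you need to choose an issuer and a method for validating domain ownership (Challenges). These notes use ACME with DNS validation.
Object flow of a certificate request:
Order object ->
Docs: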
https://cert-manager.io/docs/configuration/acme/dns01/
https://cert-manager.io/docs/configuration/acme/dns01/#webhook
The example uses Aliyun DNS.
Install the DNS webhook
The webhook lets cert-manager perform domain validation through Aliyun DNS.
➜ helm repo add cert-manager-alidns-webhook https://devmachine-fr.github.io/cert-manager-alidns-webhook
➜ helm repo update
➜ helm install cert-manager-alidns-webhook cert-manager-alidns-webhook/alidns-webhook -n cert-manager --set groupName=zyh
😒 groupName must match the groupName in the ClusterIssuer object created later. https://github.com/DEVmachine-fr/cert-manager-alidns-webhook/issues/11
Create the DNS provider token
The example uses the Aliyun DNS service.
➜ kubectl create secret generic alidns-secrets --from-literal="access-token=" --from-literal="secret-key=" -n cert-manager
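Make sure the RAM AccessKey has read and write permission for the DNS service.
Create the ClusterIssuer
A ClusterIssuer object specifies which certificate issuer to use (cluster-scoped). Here the issuer is ACME, and domain ownership is validated through DNS.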
https://cert-manager.io/docs/configuration/acme/
# Staging issuer
➜ cat << EOF | tee clusterissuer-letsencrypt-staging-ali.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging-ali
spec:
  acme:
    email:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-staging-ali
    solvers:
    - dns01:
        webhook:
          config:
            regionId: cn-beijing
            accessTokenSecretRef:
              key: access-token
              name: alidns-secrets
            secretKeySecretRef:
              key: secret-key
              name: alidns-secrets
          groupName: zyh
          solverName: alidns-solver
EOF
# Production issuer
➜ cat << EOF | tee clusterissuer-letsencrypt-prod-ali.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod-ali
spec:
  acme:
    email:
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod-ali
    solvers:
    - dns01:
        webhook:
          config:
            regionId: cn-beijing
            accessTokenSecretRef:
              key: access-token
              name: alidns-secrets
            secretKeySecretRef:
              key: secret-key
              name: alidns-secrets
          groupName: zyh
          solverName: alidns-solver
EOF
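acme.server: the ACME server
- https://acme-staging-v02.api.letsencrypt.org/directory staging server
- https://acme-v02.api.letsencrypt.org/directory production server
acme.email: email address of the ACME account [set your own]
acme.privateKeySecretRef.name: Secret that stores the ACME account private key; created automatically
webhook.config: the DNS token the webhook uses [set your own]
groupName: not a domain covered by the certificate but a group (organization) name [set your own]; it must match the groupName used when installing the webhook
solverName: the solver defined in the webhook; alidns-solver appears to be hard-coded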
Verification
Confirm that clusterissuer.cert-manager.io/letsencrypt-*-ali reports True
➜ kubectl get Issuers,ClusterIssuers -n cert-manager
NAME                                                          READY   AGE
issuer.cert-manager.io/cert-manager-alidns-webhook-ca         True    20h
issuer.cert-manager.io/cert-manager-alidns-webhook-selfsign   True    20h

NAME                                                    READY   AGE
clusterissuer.cert-manager.io/letsencrypt-prod-ali      True    137m
clusterissuer.cert-manager.io/letsencrypt-staging-ali   True    137m
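If it reports False, use describe to read the error message. In China, an error you may see is:
Error initializing issuer: context deadline exceeded
This usually means the ACME service cannot be reached. In that case, wait patiently until the registration succeeds.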
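To tell a network problem apart from a configuration problem, it can help to probe the ACME directory URL directly. The sketch below uses plain curl; the function name `acme_status` and the 10-second limit are assumptions. It only reflects what cert-manager sees if it runs from a machine with the same network path as the cert-manager pod.

```shell
# Hypothetical helper: print the HTTP status code returned by an ACME directory URL.
# Usage: acme_status <directory-url>
acme_status() {
  # -sS: no progress bar, but still print errors
  # -m 10: give up after 10 seconds (assumed limit)
  # -o /dev/null -w '%{http_code}\n': discard the body, print only the status code
  curl -sS -m 10 -o /dev/null -w '%{http_code}\n' "$1"
}
```

Calling `acme_status https://acme-staging-v02.api.letsencrypt.org/directory` should print 200 when the server is reachable. A timeout here is consistent with the "context deadline exceeded" error.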
Create the Certificate
A Certificate object requests and generates the certificate
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-com
  namespace: dev-zyh
spec:
  secretName: example-com-tls
  issuerRef:
    # must be the name of an existing ClusterIssuer,
    # e.g. letsencrypt-staging-ali or letsencrypt-prod-ali from above
    name: letsencrypt-ali
    kind: ClusterIssuer
  dnsNames:
  - '*.example.com'
  - example.com
  - example.org
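namespace: the namespace in which the certificate is used
secretName: name of the Secret that stores the certificate; created automatically
issuerRef: the issuer to use
dnsNames: the domain names the certificate covers
Verify again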
➜ kubectl get Certificates,CertificateRequests,Orders,Challenges -n dev-zyh
Successful result
➜ kubectl get Certificates,CertificateRequests,Orders,Challenges -n dev-zyh -o wide
NAME                                   READY   SECRET         ISSUER            STATUS                                          AGE
certificate.cert-manager.io/zyh-cool   True    zyh-cool-tls   letsencrypt-ali   Certificate is up to date and has not expired   15m

NAME                                                APPROVED   DENIED   READY   ISSUER            REQUESTOR                                         STATUS                                         AGE
certificaterequest.cert-manager.io/zyh-cool-jgj92   True                True    letsencrypt-ali   system:serviceaccount:cert-manager:cert-manager   Certificate fetched from issuer successfully   15m

NAME                                                   STATE   ISSUER            REASON   AGE
order.acme.cert-manager.io/zyh-cool-jgj92-4122243896   valid   letsencrypt-ali            15m
The output "Certificate is up to date and has not expired" means the request succeeded.
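Extras
By default, certificates obtained this way (cert-manager with cert-manager-alidns-webhook as the solver) are renewed automatically.
A certificate requested through a Certificate object is valid for 90 days by default and is renewed 30 days before it expires.
You can set the lifetime with spec.duration. If spec.renewBefore is not set, renewal is triggered once 2/3 of spec.duration has elapsed.
With letsencrypt there is no need to set spec.duration, because the longest lifetime you can get is 90 days.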
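The relationship between lifetime and renewal time is simple arithmetic. The sketch below assumes that renewal starts once two thirds of the lifetime has elapsed, which reproduces the 90-day and 30-day defaults quoted above; the variable names are illustrative only.

```shell
# Sketch: derive the default renewal timing from a certificate lifetime.
# Assumption: renewal starts once 2/3 of the lifetime has elapsed.
duration_hours=2160                                            # 90 days
renew_after_hours=$(( duration_hours * 2 / 3 ))                # elapsed time before renewal
renew_before_hours=$(( duration_hours - renew_after_hours ))   # time left until expiry
echo "renew after ${renew_after_hours}h, i.e. ${renew_before_hours}h ($(( renew_before_hours / 24 )) days) before expiry"
# prints: renew after 1440h, i.e. 720h (30 days) before expiry
```

Setting `duration_hours` to another value shows how the renewal point moves when spec.duration is changed and spec.renewBefore is left unset.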
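Using certificates
Requesting directly through an Ingress
The cert-manager component ingress-shim watches the annotations and spec.tls of Ingress objects, automatically creates a Certificate object from them, and uses the referenced issuer to request the certificate.
Prerequisites:
- an Issuer or ClusterIssuer object has been created
Characteristics:
- by default, if an Order fails completely, the request is retried after 1 hour
Known and verified issues:
- If the Ingress annotations or ingress.spec.tls are modified while ingress.spec.tls.hosts stays exactly the same, do not apply the change directly. Doing so produces two simultaneous certificate requests for the same hosts. One of them never succeeds and keeps being retried until it runs into the lockout period. https://github.com/jetstack/cert-manager/issues/1888
💥 Never use exactly the same ingress.spec.tls.hosts in more than one Ingress.
- spec.tls.hosts must include every host listed under spec.rules[].host; otherwise the ingress controller serves its default "Kubernetes Ingress Controller Fake Certificate" for the missing hosts.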
An example:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    # add an annotation indicating the issuer to use.
    cert-manager.io/cluster-issuer: letsencrypt-ali
    cert-manager.io/duration: 2160h # 90d
    cert-manager.io/renew-before: 360h # 15d
  name: my-ingress # object names must be lowercase
  namespace: dev-zyh
spec:
  ingressClassName: nginx
  rules:
  - host: zyh.cool
    http:
      paths:
      - pathType: Prefix
        path: /
        backend:
          service:
            name: myservice
            port:
              number: 80
  tls: # < placing a host in the TLS config will determine what ends up in the cert's subjectAltNames
  - hosts:
    - '*.zyh.cool'
    - zyh.cool
    secretName: zyh-cool-tls # < cert-manager will store the created certificate in this secret.
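annotations.cert-manager.io/cluster-issuer: nameOfClusterIssuer
ingress-shim watches this annotation and requests the certificate from the named cluster-scoped issuer
tls.secretName
the Secret that stores the certificate; created automatically
tls.hosts
the domain names the certificate is bound to
Verification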
kubectl get Certificates,CertificateRequests,Orders,Challenges -n dev-zyh
Log of a successful certificate request
2022-01-20T14:31:04.709965618+08:00 I0120 06:31:04.709580 1 conditions.go:201] Setting lastTransitionTime for Certificate "zyh-cool-prod-tls" condition "Ready" to 2022-01-20 06:31:04.709572525 +0000 UTC m=+84737.724471939
2022-01-20T14:31:04.709985148+08:00 I0120 06:31:04.709732 1 trigger_controller.go:181] cert-manager/controller/certificates-trigger "msg"="Certificate must be re-issued" "key"="dev-cms/zyh-cool-prod-tls" "message"="Issuing certificate as Secret does not exist" "reason"="DoesNotExist"
2022-01-20T14:31:04.709989519+08:00 I0120 06:31:04.709741 1 conditions.go:201] Setting lastTransitionTime for Certificate "zyh-cool-prod-tls" condition "Issuing" to 2022-01-20 06:31:04.709739172 +0000 UTC m=+84737.724638545
2022-01-20T14:31:04.733560526+08:00 I0120 06:31:04.733475 1 controller.go:161] cert-manager/controller/certificates-trigger "msg"="re-queuing item due to optimistic locking on resource" "key"="dev-cms/zyh-cool-prod-tls" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"zyh-cool-prod-tls\": the object has been modified; please apply your changes to the latest version and try again"
2022-01-20T14:31:04.733613304+08:00 I0120 06:31:04.733525 1 trigger_controller.go:181] cert-manager/controller/certificates-trigger "msg"="Certificate must be re-issued" "key"="dev-cms/zyh-cool-prod-tls" "message"="Issuing certificate as Secret does not exist" "reason"="DoesNotExist"
2022-01-20T14:31:04.733620510+08:00 I0120 06:31:04.733538 1 conditions.go:201] Setting lastTransitionTime for Certificate "zyh-cool-prod-tls" condition "Issuing" to 2022-01-20 06:31:04.73353554 +0000 UTC m=+84737.748434908
2022-01-20T14:31:04.833325827+08:00 I0120 06:31:04.833054 1 conditions.go:261] Setting lastTransitionTime for CertificateRequest "zyh-cool-prod-tls-22n7r" condition "Approved" to 2022-01-20 06:31:04.833048731 +0000 UTC m=+84737.847948132
2022-01-20T14:31:04.863021821+08:00 I0120 06:31:04.862933 1 conditions.go:261] Setting lastTransitionTime for CertificateRequest "zyh-cool-prod-tls-22n7r" condition "Ready" to 2022-01-20 06:31:04.862927112 +0000 UTC m=+84737.877826465
2022-01-20T14:31:17.954783254+08:00 I0120 06:31:17.954632 1 dns.go:88] cert-manager/controller/challenges/Present "msg"="presenting DNS01 challenge for domain" "dnsName"="test.zyh.cool" "domain"="test.zyh.cool" "resource_kind"="Challenge" "resource_name"="zyh-cool-prod-tls-22n7r-2700220904-1691659519" "resource_namespace"="dev-cms" "resource_version"="v1" "type"="DNS-01"
2022-01-20T14:32:27.475094724+08:00 I0120 06:32:27.474930 1 acme.go:209] cert-manager/controller/certificaterequests-issuer-acme/sign "msg"="certificate issued" "related_resource_kind"="Order" "related_resource_name"="zyh-cool-prod-tls-22n7r-2700220904" "related_resource_namespace"="dev-cms" "related_resource_version"="v1" "resource_kind"="CertificateRequest" "resource_name"="zyh-cool-prod-tls-22n7r" "resource_namespace"="dev-cms" "resource_version"="v1"
2022-01-20T14:32:27.475161100+08:00 I0120 06:32:27.475001 1 conditions.go:250] Found status change for CertificateRequest "zyh-cool-prod-tls-22n7r" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2022-01-20 06:32:27.474997969 +0000 UTC m=+84820.489897312
2022-01-20T14:32:27.537540719+08:00 I0120 06:32:27.537414 1 conditions.go:190] Found status change for Certificate "zyh-cool-prod-tls" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2022-01-20 06:32:27.537409387 +0000 UTC m=+84820.552308752
2022-01-20T14:32:27.552649591+08:00 E0120 06:32:27.552582 1 controller.go:211] cert-manager/controller/challenges "msg"="challenge in work queue no longer exists" "error"="challenge.acme.cert-manager.io \"zyh-cool-prod-tls-22n7r-2700220904-1691659519\" not found"
2022-01-20T14:32:27.571542097+08:00 I0120 06:32:27.571149 1 controller.go:161] cert-manager/controller/certificates-readiness "msg"="re-queuing item due to optimistic locking on resource" "key"="dev-cms/zyh-cool-prod-tls" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"zyh-cool-prod-tls\": the object has been modified; please apply your changes to the latest version and try again"
2022-01-20T14:32:27.571579091+08:00 I0120 06:32:27.571390 1 conditions.go:190] Found status change for Certificate "zyh-cool-prod-tls" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2022-01-20 06:32:27.571386394 +0000 UTC m=+84820.586285748
Some of these entries look like errors but are not.
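When reading the controller log, the harmless entries can be filtered out so that real problems stand out. The sketch below drops the two message types that appear in the successful run above; the function name `filter_benign_logs` is made up, and treating exactly these two messages as benign is an assumption based on that single log.

```shell
# Hypothetical helper: drop log lines that the successful run above shows to be harmless.
# Reads cert-manager controller logs on stdin and prints every other line.
filter_benign_logs() {
  grep -v \
    -e 're-queuing item due to optimistic locking on resource' \
    -e 'challenge in work queue no longer exists'
}
```

A possible use is `kubectl logs -n cert-manager deploy/cert-manager | filter_benign_logs`, assuming the controller Deployment is named cert-manager as in the Helm install above.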
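Using a certificate that was not requested through the Ingress
Prerequisites:
- the certificate has already been issued successfully
Ingress configuration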
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress # object names must be lowercase
  namespace: dev-zyh
spec:
  ingressClassName: nginx
  rules:
  - host: zyh.cool
    http:
      paths:
      - pathType: Prefix
        path: /
        backend:
          service:
            name: myservice
            port:
              number: 80
  tls:
  - hosts:
    - '*.zyh.cool'
    - zyh.cool
    secretName: zyh-cool-tls # < the Secret that holds the already issued certificate.
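Monitoring
To be continued
Uninstallation
List the resources created with cert-manager and delete the ones that remain before uninstalling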
kubectl get Issuers,ClusterIssuers,Certificates,CertificateRequests,Orders,Challenges --all-namespaces
Uninstall cert-manager
helm --namespace cert-manager delete cert-manager
kubectl delete namespace cert-manager
Delete the CRDs
helm list -n cert-manager -o yaml | grep app_version
kubectl delete -f https://github.com/jetstack/cert-manager/releases/download/vX.Y.Z/cert-manager.crds.yaml
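Replace vX.Y.Z with the installed version, which is the app_version printed by helm list. That command only returns it while the release is still installed, so note the version before running helm delete.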
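To avoid editing the URL by hand, the version can be substituted with a small function. This is only a sketch; the name `crds_url` is hypothetical, and the URL pattern is the one used in the commands above.

```shell
# Hypothetical helper: build the CRD manifest URL for a given cert-manager version.
# Usage: crds_url v1.6.1
crds_url() {
  printf 'https://github.com/jetstack/cert-manager/releases/download/%s/cert-manager.crds.yaml\n' "$1"
}
```

With the version installed in these notes, the CRDs would then be removed with `kubectl delete -f "$(crds_url v1.6.1)"`.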