Deploy GitHub Pages
This commit is contained in:
parent
c34beca21f
commit
ea70134ed3
4 changed files with 86 additions and 77 deletions
<h2 id="summary">Summary<a class="headerlink" href="#summary" title="Permanent link"> ¶</a></h2>
<p>Teach ingress-nginx about the availability zones in which endpoints are running. This way, the ingress-nginx pod will do its best to proxy to a zone-local endpoint.</p>
<h2 id="motivation">Motivation<a class="headerlink" href="#motivation" title="Permanent link"> ¶</a></h2>
<p>When users run their services across multiple availability zones, they usually pay for egress traffic between zones. Providers such as GCP and Amazon EC2 usually charge extra for this.
When picking an endpoint to route a request to, ingress-nginx does not consider whether the endpoint is in a different zone or the same one. That means it is at least equally likely
to pick an endpoint from another zone and proxy the request to it. In this situation, the response from the endpoint to the ingress-nginx pod is considered
inter-zone traffic and usually costs extra money.</p>
<p>At the time of this writing, GCP charges $0.01 per GB of inter-zone egress traffic according to https://cloud.google.com/compute/network-pricing.
According to https://datapath.io/resources/blog/what-are-aws-data-transfer-costs-and-how-to-minimize-them/, Amazon also charges the same amount as GCP for cross-zone egress traffic.</p>
<p>This can be a lot of money depending on one's traffic. By teaching ingress-nginx about zones, we can eliminate, or at least decrease, this cost.</p>
<p>Arguably, zone-local network latency should also be better than cross-zone latency.</p>
<h3 id="goals">Goals<a class="headerlink" href="#goals" title="Permanent link"> ¶</a></h3>
<ul>
<li>Given a regional cluster running ingress-nginx, ingress-nginx should make a best effort to pick a zone-local endpoint when proxying</li>
<li>This should not impact the canary feature</li>
<li>ingress-nginx should be able to operate successfully if there are no zonal endpoints</li>
</ul>
<h3 id="non-goals">Non-Goals<a class="headerlink" href="#non-goals" title="Permanent link"> ¶</a></h3>
<ul>
<li>This feature will rely on https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/#failure-domainbetakubernetesiozone; it is not this KEP's goal to support other cases</li>
</ul>
<h2 id="proposal">Proposal<a class="headerlink" href="#proposal" title="Permanent link"> ¶</a></h2>
<p>The idea here is to have the controller part of ingress-nginx
(1) detect what zone its current pod is running in and
(2) detect the zone for every endpoint it knows about.
After that, it will post that data as part of the endpoints to Lua land.
When picking an endpoint, the Lua balancer will try to pick a zone-local endpoint first and,
if there is no zone-local endpoint, fall back to the current behaviour.</p>
<p>Initially, this feature should be optional, since it will make the load balancing harder to reason about and not everyone may want that.</p>
<p><strong>How does the controller know what zone it runs in?</strong>
We can have the pod spec pass the node name using the downward API as an environment variable.
Upon startup, the controller can get the node details from the API based on the node name.
Once the node details are obtained,
we can extract the zone from the <code>failure-domain.beta.kubernetes.io/zone</code> annotation.
Then we can pass that value to Lua land through the Nginx configuration
when loading the <code>lua_ingress.lua</code> module in the <code>init_by_lua</code> phase.</p>
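<p>The lookup described above can be sketched as follows. This is only an illustrative Go sketch, not the controller's actual code: <code>zoneFromNode</code> is a hypothetical helper, the simulated <code>NODE_NAME</code> value and the inline label map stand in for the downward API and a real node GET against the API server.</p>

```go
package main

import (
	"fmt"
	"os"
)

// Well-known zone key referenced by this KEP.
const zoneKey = "failure-domain.beta.kubernetes.io/zone"

// zoneFromNode extracts the zone from a node's metadata labels,
// returning "" when the node carries no zone information.
func zoneFromNode(labels map[string]string) string {
	return labels[zoneKey]
}

func main() {
	// The pod spec would expose spec.nodeName via the downward API, e.g.:
	//   env:
	//   - name: NODE_NAME
	//     valueFrom:
	//       fieldRef:
	//         fieldPath: spec.nodeName
	// We simulate that environment variable here for illustration.
	os.Setenv("NODE_NAME", "node-a")
	nodeName := os.Getenv("NODE_NAME")

	// In the real controller these labels would come from fetching the
	// node object from the API server; this map stands in for that call.
	labels := map[string]string{zoneKey: "us-central1-a"}

	fmt.Printf("node=%s zone=%s\n", nodeName, zoneFromNode(labels))
}
```

<p>The extracted zone string is what would then be handed to Lua through the Nginx configuration.</p>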
<p><strong>How do we extract zones for endpoints?</strong>
We can have the controller watch create events on nodes in the entire cluster and, based on that, keep a map of nodes to zones in memory.
When we generate the endpoints list, we can access the node name using <code>.subsets.addresses[i].nodeName</code>
and, based on that, fetch the zone from the map in memory and store it as a field on the endpoint.
<strong>This solution assumes the <code>failure-domain.beta.kubernetes.io/zone</code></strong> annotation does not change until the end of the node's life. Otherwise, we would have to
watch update events on the nodes as well, and that would add even more overhead.</p>
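<p>The in-memory map and the endpoint stamping could look roughly like the following Go sketch. The names <code>nodeZones</code>, <code>OnNodeAdd</code>, <code>endpoint</code>, and <code>annotateZones</code> are hypothetical; in the real controller the events would come from a node informer.</p>

```go
package main

import "fmt"

const zoneKey = "failure-domain.beta.kubernetes.io/zone"

// nodeZones is the in-memory node-name-to-zone map described above.
// In the controller it would be populated from node create events.
type nodeZones map[string]string

// OnNodeAdd records a node's zone when a create event is observed.
func (m nodeZones) OnNodeAdd(name string, labels map[string]string) {
	if zone, ok := labels[zoneKey]; ok {
		m[name] = zone
	}
}

// endpoint mirrors the per-endpoint data posted to Lua land; the Zone
// field is the one this KEP proposes to add.
type endpoint struct {
	Address  string
	NodeName string // corresponds to .subsets.addresses[i].nodeName
	Zone     string
}

// annotateZones fills in each endpoint's Zone from the in-memory map.
func (m nodeZones) annotateZones(eps []endpoint) {
	for i := range eps {
		eps[i].Zone = m[eps[i].NodeName]
	}
}

func main() {
	zones := nodeZones{}
	zones.OnNodeAdd("node-a", map[string]string{zoneKey: "us-central1-a"})
	zones.OnNodeAdd("node-b", map[string]string{zoneKey: "us-central1-b"})

	eps := []endpoint{
		{Address: "10.0.0.1", NodeName: "node-a"},
		{Address: "10.0.0.2", NodeName: "node-b"},
	}
	zones.annotateZones(eps)
	fmt.Println(eps)
}
```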
<p>Alternatively, we can fetch the list of nodes only when there is no entry in memory for a given node name. This is probably a better solution,
because then we would avoid watching for API changes on node resources. We can eagerly fetch all the nodes and build the node-name-to-zone mapping on start.
From there on, we sync it while building endpoints in the main event loop, if and only if there is no existing entry for the node of an endpoint.
This means an extra API call when the cluster has expanded.</p>
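<p>This lazy alternative can be sketched as a cache that performs a single fetch on a miss. <code>zoneCache</code> and its injected <code>fetch</code> function are hypothetical stand-ins for the eagerly built map and a node GET against the API server.</p>

```go
package main

import "fmt"

// zoneCache lazily resolves node zones: it consults the in-memory map
// first and calls fetch (one extra API call) only on a miss, e.g. after
// the cluster has expanded.
type zoneCache struct {
	zones map[string]string
	fetch func(nodeName string) string // stands in for a node GET against the API server
}

func (c *zoneCache) Zone(nodeName string) string {
	if zone, ok := c.zones[nodeName]; ok {
		return zone // hit: no API call
	}
	zone := c.fetch(nodeName) // miss: one extra API call
	c.zones[nodeName] = zone
	return zone
}

func main() {
	calls := 0
	c := &zoneCache{
		zones: map[string]string{"node-a": "us-central1-a"}, // eagerly built on start
		fetch: func(string) string { calls++; return "us-central1-b" },
	}
	fmt.Println(c.Zone("node-a"), calls)   // known node, served from memory
	fmt.Println(c.Zone("node-new"), calls) // new node, fetched once
	fmt.Println(c.Zone("node-new"), calls) // now cached, no further fetch
}
```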
<p><strong>How do we make sure we do our best to choose a zone-local endpoint?</strong>
This will be done on the Lua side. For every backend, we will initialize two balancer instances:
(1) one with all endpoints and
(2) one with all endpoints corresponding to the current zone for the backend.
Then, once we choose which backend needs to serve a given request,
we will first try to use the zonal balancer for that backend.
If the zonal balancer does not exist (i.e. there is no zonal endpoint),
then we will use the general balancer.
In case of a zonal outage, we assume that the readiness probe will fail, the controller will
see no endpoints for the backend, and therefore we will use the general balancer.</p>
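<p>In the actual implementation this selection would live in Lua; the following Go sketch only illustrates the zonal-first, general-fallback choice, with <code>balancer</code>, <code>backend</code>, and <code>choose</code> as hypothetical names.</p>

```go
package main

import "fmt"

// balancer is a minimal stand-in for a load-balancer instance holding
// the endpoints it can pick from.
type balancer struct{ endpoints []string }

// pick returns an endpoint; a real balancer would round-robin or similar.
func (b *balancer) pick() string { return b.endpoints[0] }

// backend carries the two balancer instances described above: one with
// all endpoints and one with only the zone-local endpoints. zonal is
// nil when there are no endpoints in the current zone.
type backend struct {
	general *balancer
	zonal   *balancer
}

// choose prefers the zonal balancer and falls back to the general one,
// mirroring the selection this KEP proposes for the Lua balancer.
func (b *backend) choose() *balancer {
	if b.zonal != nil && len(b.zonal.endpoints) > 0 {
		return b.zonal
	}
	return b.general
}

func main() {
	withZonal := &backend{
		general: &balancer{endpoints: []string{"10.0.0.1", "10.0.1.1"}},
		zonal:   &balancer{endpoints: []string{"10.0.0.1"}},
	}
	noZonal := &backend{general: &balancer{endpoints: []string{"10.0.1.1"}}}

	fmt.Println(withZonal.choose().pick()) // zone-local endpoint preferred
	fmt.Println(noZonal.choose().pick())   // falls back to the general balancer
}
```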
<p>We can enable the feature using a configmap setting. Doing it this way makes it easier to roll back in case of a problem.</p>
<h2 id="implementation-history">Implementation History<a class="headerlink" href="#implementation-history" title="Permanent link"> ¶</a></h2>
<ul>
<li>initial version of the KEP is shipped</li>
<li>proposal and implementation details are done</li>
</ul>
<h2 id="drawbacks-optional">Drawbacks [optional]<a class="headerlink" href="#drawbacks-optional" title="Permanent link"> ¶</a></h2>
<p>More load on the Kubernetes API server.</p>