<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title>Posts on SDN-Warrior | Daniel Krieger</title>
		<link>https://sdn-warrior.org/posts/</link>
		<description>Recent content in Posts on SDN-Warrior | Daniel Krieger</description>
		<generator>Hugo -- gohugo.io</generator>
		<language>en-us</language>
		<copyright>Daniel Krieger</copyright>
		<lastBuildDate>Mon, 06 Apr 2026 23:00:00 +0200</lastBuildDate>
		<atom:link href="https://sdn-warrior.org/posts/index.xml" rel="self" type="application/rss+xml" />
		
		<item>
			<title>VCF9 - Extend Management Domain Cluster</title>
			<link>https://sdn-warrior.org/posts/vcf9-extende-mgmt-domain/</link>
			<pubDate>Mon, 06 Apr 2026 23:00:00 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf9-extende-mgmt-domain/</guid>
			<description><![CDATA[How do you actually expand a management domain in VCF9, and what do I need to keep in mind? A quick guide.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>Even though it’s a financially unwise decision right now, I decided to buy a third Minisforum MS-A2 because of the upcoming VCF 9.1 release and because I wanted to test other services like SSP in my lab. Luckily, I still had some RAM in my drawer from the good old days, when you didn’t have to sacrifice your firstborn or donate a kidney to get 128 GB of RAM. Yes, friends of sophisticated over-engineering, I am a blessed man.</p>
<p>But joking aside, this blog post is actually supposed to be about how I can expand my existing management domain, since I don’t want to reinstall everything—and to be honest, I’m not sure yet if it will all work, because my management domain currently consists of two clusters. The first cluster runs on two AMD MS-A2 servers with memory tiering and hosts my fleet, and the second cluster is a nested cluster on MS-01 servers (yes, the ones with the Intel CPUs). On top of that, I’m also running a nested instance of VCF9 that’s onboarded into my fleet. Of course, not everything can be running at the same time, since I don’t have enough RAM and processing power. So it’ll be interesting to see if the whole thing will work out somehow.</p>
<p>So if you&rsquo;re able to read this blog, it must have worked out somehow; if not, you&rsquo;ll never know.</p>
<h2 id="lets-get-ready-to-rumble">Let’s Get Ready to Rumble</h2>
<p>It all starts off quite innocently: first, unpack the new server, install the NVMEs—you know the drill. Then I install a fresh ESX 9.0.2 image and configure the network, DNS, NTP, memory tiering, and the Ryzen workarounds. If you want to read exactly what needs to be done, you can check out this <a href="https://sdn-warrior.org/posts/vcf9-ms-a2-special/">article</a>. I’ve written everything down in detail there.</p>
<figure><a href="01.png"><picture><source srcset="/vcf9-mgmt-domain/01_hu_ce28ccd2fa908264.png" type="image/png">
          <img
            src="/vcf9-mgmt-domain/01_hu_ce28ccd2fa908264.png"alt="JetKVM"width="1718"
            height="1236"/>
        </picture></a><figcaption><p>ESX installer - JetKVM (click to enlarge)</p></figcaption></figure>
<p>The goal should be to have a basic ESX9 host that can resolve DNS, has SSH enabled, has NTP working, has over 377 GB of RAM (memory tiering), and has only one network adapter connected. In other words, it should be exactly the same as if I were installing VCF9 from scratch.</p>
<p>Now that these preparations have been made, I can get started on the actual cluster expansion. To do this, I first need to check my network pool settings and, if necessary, adjust the IP ranges for the NFS and vMotion networks.</p>
<h2 id="network-pools">Network Pools</h2>
<p>The network pools are a bit hidden. In VCF 5.x, this was something you configured in the SDDC Manager. In VCF 9, this has now moved to vCenter under Global Inventory -&gt; Hosts -&gt; Network Pools.</p>
<figure><a href="02.png"><picture><source srcset="/vcf9-mgmt-domain/02_hu_d0583ef61538f87e.png" type="image/png">
          <img
            src="/vcf9-mgmt-domain/02_hu_d0583ef61538f87e.png"alt="Network Pools"width="1713"
            height="1008"/>
        </picture></a><figcaption><p>Network Pools (click to enlarge)</p></figcaption></figure>
<p>Fortunately, I was a bit more generous here and still have some IP addresses available.</p>

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>Warning</b>
        </div>
        <div class="admonition-content">If the IP range isn&rsquo;t sufficient, it can be expanded—but only if there are still free IP addresses available in the subnet. The subnet cannot be modified later.
Please always take growth into account when sizing your subnets.</div>
    </aside>
<p>The TEP IP pool is managed in NSX and, of course, must have two free IP addresses per host. In my case, it just barely works. The pool can be adjusted here as well.</p>
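<p>The sizing rule above (two TEP addresses per host, and a range that can only grow inside its existing subnet) is easy to sanity-check before you start. The following is a minimal sketch using only the Python standard library; the addresses, host counts, and function names are made-up examples and not values from my lab.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress


def pool_capacity(start, end):
    """Number of addresses in an inclusive IP range."""
    first = ipaddress.ip_address(start)
    last = ipaddress.ip_address(end)
    return int(last) - int(first) + 1


def free_after_expansion(start, end, current_hosts, new_hosts, ips_per_host=2):
    """Addresses left once every host has its IPs; a negative value means the range is too small."""
    needed = (current_hosts + new_hosts) * ips_per_host
    return pool_capacity(start, end) - needed


def range_fits_subnet(start, end, subnet):
    """True if both ends of the (possibly extended) range are still inside the subnet."""
    network = ipaddress.ip_network(subnet)
    return ipaddress.ip_address(start) in network and ipaddress.ip_address(end) in network


if __name__ == "__main__":
    # Hypothetical TEP range with six addresses, two existing hosts, one new host.
    print(free_after_expansion("10.0.30.10", "10.0.30.15", current_hosts=2, new_hosts=1))
    print(range_fits_subnet("10.0.30.10", "10.0.30.40", "10.0.30.0/24"))
</code></pre></div>
<p>For the vMotion or NFS pools, set <code>ips_per_host=1</code>, assuming one VMkernel address per host and network.</p>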
<figure><a href="03.png"><picture><source srcset="/vcf9-mgmt-domain/03_hu_26b92acaed9dd616.png" type="image/png">
          <img
            src="/vcf9-mgmt-domain/03_hu_26b92acaed9dd616.png"alt="NSX TEP Pool"width="1719"
            height="1145"/>
        </picture></a><figcaption><p>NSX TEP Pool (click to enlarge)</p></figcaption></figure>
<p>NSX is significantly more flexible than VCF in vCenter when it comes to the network pool.
Once everything has been checked, I can now add the host.</p>
<h2 id="host-onboarding">Host onboarding</h2>
<p>Onboarding is no longer done via SDDC Manager as it used to be; it now also has to be done via the global inventory in vCenter.
To do this, go to <em><strong>Global Inventory List -&gt; Hosts -&gt; Unassigned Hosts -&gt; COMMISSION HOSTS</strong></em></p>
<p>After that, you&rsquo;re immediately greeted with a friendly checklist of everything that needs to be done:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">- Host for vSAN/vSAN ESA/vSAN Storage workload domain should be vSAN/vSAN ESA/vSAN Storage compliant and certified per the VMware Hardware Compatibility Guide. BIOS, HBA, SSD, HDD, etc. must match the VMware Hardware Compatibility Guide.
</span></span><span class="line"><span class="cl">- Host has the drivers and firmware versions specified in the VMware Compatibility Guide.
</span></span><span class="line"><span class="cl">- Host has ESXi installed on it. The host must be preinstalled with supported versions (9.0.2.0.25148076)
</span></span><span class="line"><span class="cl">- Host is configured with DNS server for forward and reverse lookup and FQDN.
</span></span><span class="line"><span class="cl">- Hostname should be same as the FQDN.
</span></span><span class="line"><span class="cl">- Management IP is configured to first NIC port.
</span></span><span class="line"><span class="cl">- Ensure that the host has a standard switch and the default uplinks with 10Gb speed are configured starting with traditional numbering (e.g., vmnic0) and increasing sequentially.
</span></span><span class="line"><span class="cl">- Host hardware health status is healthy without any errors.
</span></span><span class="line"><span class="cl">- All disk partitions on HDD / SSD are deleted.
</span></span><span class="line"><span class="cl">- Ensure required network pool is created and available before host commissioning.
</span></span><span class="line"><span class="cl">- Ensure hosts to be used for VSAN workload domain are associated with VSAN enabled network pool.
</span></span><span class="line"><span class="cl">- Ensure hosts to be used for NFS workload domain are associated with NFS enabled network pool.
</span></span><span class="line"><span class="cl">- Ensure hosts to be used for VMFS on FC workload domain are associated with NFS or VMOTION only enabled network pool.
</span></span><span class="line"><span class="cl">- Ensure hosts to be used for vVol FC workload domain are associated with NFS or VMOTION only enabled network pool.
</span></span><span class="line"><span class="cl">- Ensure hosts to be used for vVol NFS workload domain are associated with NFS and VMOTION only enabled network pool.
</span></span><span class="line"><span class="cl">- Ensure hosts to be used for vVol iSCSI workload domain are associated with iSCSI and VMOTION only enabled network pool.
</span></span><span class="line"><span class="cl">- For hosts with a DPU device, enable SR-IOV in the BIOS and in the vSphere Client (if required by your DPU vendor).
</span></span></code></pre></div><p>Of course, I’ve carefully checked all of this and can confirm it.</p>
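<p>The DNS item on that checklist (forward and reverse lookup, hostname equal to the FQDN) can be verified from any machine that uses the same DNS server. Here is a small sketch with the Python standard library; the function name <code>check_dns</code> and the host name are hypothetical, and the resolver functions are passed in as parameters so the logic can be tried without a real DNS server.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import socket


def check_dns(fqdn, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return (ok, detail); ok is True when the FQDN resolves to an IP whose PTR record points back to it."""
    try:
        ip = forward(fqdn)          # forward lookup: name to IPv4 address
        ptr_name = reverse(ip)[0]   # reverse lookup: address back to primary name
    except OSError as exc:          # socket.gaierror and socket.herror are OSError subclasses
        return False, f"lookup failed: {exc}"
    ok = ptr_name.rstrip(".").lower() == fqdn.rstrip(".").lower()
    return ok, f"{fqdn} resolves to {ip}, reverse lookup returns {ptr_name}"


if __name__ == "__main__":
    # Hypothetical host name; replace it with the FQDN of the new ESX host.
    print(check_dns("esx03.lab.example"))
</code></pre></div>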
<p>After specifying the correct network pool and entering the correct storage type, FQDN, username, and password, the validation process failed with a certificate error. But why? After all, I created a new self-signed certificate during installation.</p>
<figure><a href="04.png"><picture><source srcset="/vcf9-mgmt-domain/04_hu_7f1961bb4d7a220e.png" type="image/png">
          <img
            src="/vcf9-mgmt-domain/04_hu_7f1961bb4d7a220e.png"alt="Cert Error"width="1164"
            height="867"/>
        </picture></a><figcaption><p>Cert Error (click to enlarge)</p></figcaption></figure>
<p>The problem, and this is something the pre-check doesn’t tell you, is that once you’ve rolled out your own certificates in your domain, the host certificate must be issued by the same CA. So I have to create a certificate signing request (CSR) via the ESX GUI, submit it to my Microsoft CA, and import the signed certificate on my host. I love certificates. Not.</p>
<p>If you&rsquo;ve never done this before, you can do it in the ESX GUI under <em><strong>Host -&gt; Manage -&gt; Security &amp; Users -&gt; Certificates</strong></em>.
After the certificate exchange, the validation process completes successfully. The ESX server does not need to be restarted.
Once the commissioning is complete, the host should now appear under “Unassigned Hosts.” That means half the work is already done.</p>
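<p>If you want to confirm that a host certificate really chains to your own CA before running the validation, a quick TLS handshake that trusts only that CA will tell you. This is just a sketch with the Python standard library; the host name, the CA file path, and the function names are assumptions for illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import socket
import ssl


def issuer_common_name(peer_cert):
    """Extract the issuer CN from the dict returned by SSLSocket.getpeercert()."""
    for rdn in peer_cert.get("issuer", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


def host_cert_issuer(host, ca_file, port=443, timeout=5.0):
    """Handshake with the host while trusting only ca_file and return the issuer CN.

    Raises ssl.SSLCertVerificationError if the certificate does not chain to that CA.
    """
    context = ssl.create_default_context(cafile=ca_file)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return issuer_common_name(tls.getpeercert())


if __name__ == "__main__":
    # Hypothetical values: the new ESX host and the exported root CA in PEM format.
    print(host_cert_issuer("esx03.lab.example", "lab-root-ca.pem"))
</code></pre></div>
<p>A freshly installed host with its self-signed certificate fails this handshake, which matches the error shown in the commissioning dialog.</p>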
<figure><a href="05.png"><picture><source srcset="/vcf9-mgmt-domain/05_hu_4291125f62b8ad2c.png" type="image/png">
          <img
            src="/vcf9-mgmt-domain/05_hu_4291125f62b8ad2c.png"alt="Unassigned Host"width="1722"
            height="668"/>
        </picture></a><figcaption><p>Unassigned Hosts (click to enlarge)</p></figcaption></figure>
<h2 id="extend-cluster">Extend Cluster</h2>
<p>Now comes the fun part: actually expanding the cluster. To do this, <em><strong>right-click on the existing cluster -&gt; Add Host -&gt; Add Unassigned Hosts</strong></em>.
Here, too, the process differs from VCF 5.x, as this task is not performed in SDDC Manager.

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">In the current VCF 9 release, this can still be done via SDDC Manager, but that approach is set to be phased out completely, and the recommended method is via vCenter.</div>
    </aside></p>
<p>The process itself is pretty straightforward, and the most important thing is that the uplink assignment on the distributed switch is correct. That’s actually the only potential source of error at this stage.</p>
<figure><a href="06.png"><picture><source srcset="/vcf9-mgmt-domain/06_hu_92a485f96d1b29fd.png" type="image/png">
          <img
            src="/vcf9-mgmt-domain/06_hu_92a485f96d1b29fd.png"alt="dVSwitch"width="1233"
            height="825"/>
        </picture></a><figcaption><p>dVSwitch (click to enlarge)</p></figcaption></figure>
<p>Here, too, there is a brief validation step after confirmation, and if no errors occur, the process should run fully automatically.
The process can be monitored in both SDDC Manager and vCenter.
Personally, I find the view in SDDC Manager a bit clearer than the one in the vCenter Recent Tasks view. You can also check the progress of the host’s NSX configuration in NSX Manager.
After about 10 minutes, the whole thing was done and dusted, and my management domain had been successfully expanded to three nodes.</p>
<figure><a href="07.png"><picture><source srcset="/vcf9-mgmt-domain/07_hu_593d89c79d66dd6a.png" type="image/png">
          <img
            src="/vcf9-mgmt-domain/07_hu_593d89c79d66dd6a.png"alt="Cluster"width="1643"
            height="884"/>
        </picture></a><figcaption><p>Finished cluster (click to enlarge)</p></figcaption></figure>
<p>To perform a manual validation, I booted up a test VM connected to an NSX network and ran a quick connectivity check to make sure my TEP network was working properly. Unfortunately, with VCF 9.0.2 and the MS-A2, I can no longer see the TEP tunnel status displayed correctly in NSX. The status simply remains gray—unknown. However, this is purely a visual issue, and other 9.0.2 users with the same hardware are experiencing the same problem.</p>
<h2 id="conclusion">Conclusion</h2>
<p>What can I say? I expected the certificate error—I’ve fallen for that one before. But to make this article a bit more useful, I decided to highlight this error again. Generally speaking, though, I would have expected more problems, since one SDDC Manager (the one from my other nested domain) is unreachable, so I was pleasantly surprised. That’s because things like an inventory scan don’t run error-free unless all vCenters and SDDC Managers are accessible, and I had already envisioned having to get those two components online somehow to complete the expansion.</p>
<p>The most important thing when expanding is careful preparation and ensuring that all pre-checks are carried out. There’s a good reason why the checklist is included in the onboarding dialog. If you’ve followed all the steps, the expansion will indeed go smoothly.</p>
]]></content>
		</item>
		
		<item>
			<title>2026 VMUG Connect - Amsterdam</title>
			<link>https://sdn-warrior.org/posts/2026-vmug-amsterdam/</link>
			<pubDate>Tue, 24 Mar 2026 23:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/2026-vmug-amsterdam/</guid>
			<description><![CDATA[A personal summary of the first VMUG Connect in Europe.]]></description>
			<content type="html"><![CDATA[<h2 id="vmug-connect-amsterdam--between-tech-talks-and-travel-vibes">VMUG Connect Amsterdam – Between Tech Talks and Travel Vibes</h2>
<p>Wow, so here we are again. Another travel blog post and no tech content. Yes, unfortunately I’m a bit behind on my own &ldquo;roadmap&rdquo;, but right now I simply don’t have the personal time to write another tech article. I actually still have so much in the works and just can’t seem to get through it all. But that won’t stop me from sharing my impressions of VMUG Connect in Amsterdam here.</p>
<h2 id="day-0---day-of-travel">Day 0 - Day of travel</h2>
<p>My trip started early Monday morning with the Deutsche Bahn, traveling via Mannheim and Düsseldorf to Amsterdam Centraal—a surprisingly uneventful train ride. I was practically right on time, caught all my connections, and everything just went smoothly—crazy. To be honest, I’m not really used to that.</p>
<p>I bought a metro ticket directly online via the Metro app (without having to register—yeah, that’s how it can be done, Deutsche Bahn), since my hotel was a bit out of the way and I had to take the metro about 16 minutes to the RAI and back every morning/evening. Writing this makes me realize that I really need to get my travel expense report done.</p>
<p>Well, anyway, I checked into the hotel and then spent some time exploring Amsterdam. All in all, the weather wasn’t great that day, so I headed back to my hotel pretty quickly and relaxed a bit.</p>
<figure><a href="01.jpg"><picture><source srcset="/2026-amsterdam/01_hu_91950889935cf041.jpg" type="image/jpeg">
          <img
            src="/2026-amsterdam/01_hu_91950889935cf041.jpg"alt="Amsterdam"width="2016"
            height="1512"/>
        </picture></a><figcaption><p>I am Amsterdam (click to enlarge)</p></figcaption></figure>
<p>It wasn’t my first visit to Amsterdam, after all, and I wasn’t really in the mood for sightseeing anyway.
So you could say it was a completely unspectacular day of arrival.</p>
<h2 id="day-1---preconnect">Day 1 - PreConnect</h2>
<p>The first day started off quite leisurely with a nice breakfast at the hotel, and afterward I met up with my colleague Steffen Richter in front of the RAI to pick up our badges and check out the venue. The expo was still a work in progress at that point, and we ran into the first few familiar faces.
Speaking of faces, no trip would be complete without a happy face selfie. I know you didn&rsquo;t ask for it, but here it is.</p>
<figure><a href="02.jpg"><picture><source srcset="/2026-amsterdam/02_hu_2395d8beba90c1ae.jpg" type="image/jpeg">
          <img
            src="/2026-amsterdam/02_hu_2395d8beba90c1ae.jpg"alt="Happy Face"width="1482"
            height="1111"/>
        </picture></a><figcaption><p>Happy Face (click to enlarge)</p></figcaption></figure>
<p>But let’s get to the official part. The first session I attended was a so-called PreConnect Session <em><strong>VCF: Deployment, Automation &amp; Networking</strong></em>, presented by and featuring John Nicholson. It wasn’t a traditional session with a slide deck and such, but more like an open discussion. Unfortunately, there was an error in the VMUG app that listed me as a co-speaker, but I wasn’t informed beforehand that I had a second session. I spoke briefly with John Nicholson later, and he didn’t know either that we were supposed to have a session together. I was asked about it several times and can only say sorry, but something went wrong with the organization. For me, it remains Schrödinger’s speaker slot. You never really know whether I should have participated or not. :D</p>
<figure><a href="03.jpg"><picture><source srcset="/2026-amsterdam/03_hu_2565612525c80b3f.jpg" type="image/jpeg">
          <img
            src="/2026-amsterdam/03_hu_2565612525c80b3f.jpg"alt="Schrödinger’s speaker slot"width="1920"
            height="1440"/>
        </picture></a><figcaption><p>Schrödinger’s speaker slot (click to enlarge)</p></figcaption></figure>
<p>After that, I attended the PreConnect session <em><strong>Security: All Things Security</strong></em>. As before, this wasn’t a traditional session but a panel discussion, and what can I say—Chris McCain doesn’t need a microphone. Some people in the audience said he was “American loud,” and that wasn’t meant negatively, but rather as a compliment. The panel discussion was quite entertaining, even if it didn’t really offer anything new to me.</p>
<figure><a href="04.jpg"><picture><source srcset="/2026-amsterdam/04_hu_c0b8b37099645421.jpg" type="image/jpeg">
          <img
            src="/2026-amsterdam/04_hu_c0b8b37099645421.jpg"alt="Security: All Things Security"width="2016"
            height="1512"/>
        </picture></a><figcaption><p>Security: All Things Security (click to enlarge)</p></figcaption></figure>
<p>Unfortunately, I missed my colleague Maria Kmita’s session (sorry, I would have loved to see it, but you know I tend to get sidetracked).
These PreConnect sessions were something new for me, and I found them really entertaining. Above all, they were twice as long as regular sessions, and we had the chance to ask all sorts of questions—both the obvious and the unexpected.</p>
<p>After that, I headed over to the expo for some casual “networking” (and by “networking,” I actually mean drinking beer and shooting the breeze with friends, old buddies, and new faces). The beer was actually not too bad, by the way. Yeah, I’m still German at heart, and the fact that I didn’t complain is praise enough.</p>
<figure><a href="11.jpg"><picture><source srcset="/2026-amsterdam/11_hu_b1d0205087ad90d0.jpg" type="image/jpeg">
          <img
            src="/2026-amsterdam/11_hu_b1d0205087ad90d0.jpg"alt="Athideth Sananikone"width="960"
            height="1280"/>
        </picture></a><figcaption><p>Thanks Athideth Sananikone for the picture (click to enlarge)</p></figcaption></figure>
<h2 id="day-2---here-we-go">Day 2 - Here we go</h2>
<p>This was actually the first real day of VMUG Connect. It kicked off with the general session at 9 a.m. Fresh and rested (okay, I’m lying), I showed up right on time just before 9 a.m. and listened intently to Duncan Epping as he talked about what’s new. But dude, what kind of session title is that? <em><strong>From Roadmap to Implementation: Key Innovations Redefining Storage and Cyber Resilience for VCF 9!</strong></em> Sounds like a tongue-twister to me. The general session was split into two parts, and Chris McCain also had a segment titled <em><strong>Shrink the Blast Radius: Private Cloud Segmentation Made Easy by Chris McCain</strong></em>. I have to say that DR isn’t really my area of expertise (sorry, Duncan), and the vDefend session didn’t have a ton of new stuff for me either, but Brad Tomkins did a wonderful job moderating the general keynote, and it was definitely entertaining.</p>
<figure><a href="05.jpg"><picture><source srcset="/2026-amsterdam/05_hu_e19dc58bc1967cfd.jpg" type="image/jpeg">
          <img
            src="/2026-amsterdam/05_hu_e19dc58bc1967cfd.jpg"alt="Brad Tomkins"width="2016"
            height="1512"/>
        </picture></a><figcaption><p>Brad Tomkins (click to enlarge)</p></figcaption></figure>
<p>After the opening session, I went straight to the <em><strong>Redefining the Private Cloud: The Next Evolution of VCF</strong></em> session without a break. The session offered some exciting insights into upcoming VCF versions. Among other things, it was mentioned that stateful services will soon be possible in NSX without edges. They also demonstrated what the VXLAN/EVPN integration will look like in version 9.1. We have some exciting new features coming our way, especially in NSX.</p>
<p>After that, I took some time to myself to mentally prepare for my VCAP Operations exam, which I was able to take for free thanks to VMUG Advantage. I passed, so I was able to relax and enjoy the rest of the day.</p>
<figure><picture><source srcset="/2026-amsterdam/06_hu_43d60f2597a400d0.png" type="image/png">
          <img
            src="/2026-amsterdam/06_hu_43d60f2597a400d0.png"alt="VCAP"width="600"
            height="600"/>
        </picture><figcaption><p>VCAP Operations</p></figcaption></figure>
<p>My next session was led by <a href="https://www.linkedin.com/in/giovanni-dominoni-65065678/">Giovanni Dominoni</a> and <a href="https://www.linkedin.com/in/amedeo-simone-luciano-28a9b84b/">Amedeo Simone Luciano</a> and had the promising title <em><strong>Zero-Impact Network Fabric Migration: From Cisco Nexus 9K to ACI with NSX Environments</strong></em>. It provided a practical demonstration of how to approach a network platform migration without any downtime, and it was truly inspiring to see how the two of them implemented this project. This led to some nice discussions after the session.</p>
<p>Since I ended up chatting with so many people again, I wasn’t able to attend any more sessions that day. But let’s be honest—talking to people is just as important as any session, especially when you work for a partner.</p>
<p>The welcome reception took place that evening. It was held at <em><strong>Strandzuid</strong></em>, a cozy waterfront restaurant beautifully situated right next to the RAI Convention Center, from which you could take a pleasant walk around the lake. I would also like to take this opportunity to mention comdivision, which sponsored the entire welcome reception and treated us all to a wonderful evening.</p>




    <div class="image-gallery" id="gallery-a6ec200e">
                <a href="/2026-amsterdam/welcome%20reception.jpg" class="gallery-item" data-lightbox-group="gallery-a6ec200e" data-title="2026 Amsterdam Welcome Reception"><img src="/2026-amsterdam/welcome%20reception_hu_66b3ae7b6b5c5326.jpg" alt="/2026-amsterdam/welcome reception" width="300" height="300" loading="lazy">
                    <div class="gallery-item-caption">2026 Amsterdam Welcome Reception</div>
                </a>
                <a href="/2026-amsterdam/welcome%20reception2.jpg" class="gallery-item" data-lightbox-group="gallery-a6ec200e" data-title="2026 Amsterdam Welcome Reception2"><img src="/2026-amsterdam/welcome%20reception2_hu_8540d7fb2ba7e726.jpg" alt="/2026-amsterdam/welcome reception2" width="300" height="300" loading="lazy">
                    <div class="gallery-item-caption">2026 Amsterdam Welcome Reception2</div>
                </a>
                <a href="/2026-amsterdam/vBeer.jpg" class="gallery-item" data-lightbox-group="gallery-a6ec200e" data-title="2026 Amsterdam v Beer"><img src="/2026-amsterdam/vBeer_hu_b9f37600e9233047.jpg" alt="/2026-amsterdam/vBeer" width="300" height="300" loading="lazy">
                    <div class="gallery-item-caption">2026 Amsterdam v Beer</div>
                </a>
                <a href="/2026-amsterdam/more%20happy%20faces.jpg" class="gallery-item" data-lightbox-group="gallery-a6ec200e" data-title="2026 Amsterdam More Happy Faces"><img src="/2026-amsterdam/more%20happy%20faces_hu_a4bec7a0c2eadf7e.jpg" alt="/2026-amsterdam/more happy faces" width="300" height="300" loading="lazy">
                    <div class="gallery-item-caption">2026 Amsterdam More Happy Faces</div>
                </a></div>
<h2 id="day-3---or-the-day-my-legs-were-shaking">Day 3 - or the day my legs were shaking</h2>
<p>This day was a particular focus for me because I gave my first session at an international event—all in English, with a live demo, and on a spotty internet connection. Naturally, I was pretty nervous leading up to the day. Of course, this wasn’t the first session I’ve ever given, but I think I’ll always be nervous.</p>
<p>Once again, I talked about one of my favorite topics, which is VPCs in NSX, or rather VCF9. What can I say? I just love the feature and the possibilities it offers. I was also very surprised by the audience—for one thing, the room was almost completely full, and for another, at least 80% of the people in the session were using NSX.</p>
<figure><a href="12.jpg"><picture><source srcset="/2026-amsterdam/12_hu_169d248ddf8c5077.jpg" type="image/jpeg">
          <img
            src="/2026-amsterdam/12_hu_169d248ddf8c5077.jpg"alt="Stage Time"width="1462"
            height="1096"/>
        </picture></a><figcaption><p>Stage time (click to enlarge)</p></figcaption></figure>
<p>I wasn’t used to that at all. So I cut the general NSX section short and focused more (though not enough) on the topic of traffic flow. For anyone who’d like my slides, I’m making them available for download here. Unfortunately, due to space constraints, only the PDF version is available, but that should suffice.</p>
<figure><a href="VPCs%20VMUG%20Connect.pdf" target="_blank"><picture><source srcset="/2026-amsterdam/13_hu_687c23b7d9bb9015.png" type="image/png">
          <img
            src="/2026-amsterdam/13_hu_687c23b7d9bb9015.png"alt="Download"width="1847"
            height="1036"/>
        </picture></a><figcaption><p>VMUG Slidedeck (click to download)</p></figcaption></figure>
<p>After my session, I had a quick meal, and then, unfortunately, I had to head home again. Since you can always count on Deutsche Bahn, my connecting train in Mannheim was so late that I actually ended up getting home earlier because I was able to catch a train that was supposed to have already left. It’s quite a feat to get home faster thanks to a delay.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Compared to the VMUG in <a href="https://sdn-warrior.org/posts/vmug-connect-stl-2025/">St. Louis</a>, I’d say the Amsterdam event had a better venue. The expo was also somehow better organized. To be fair, it’s worth noting that Connect in St. Louis was the first of its kind and certainly just a trial run. As always with these kinds of events, it’s great when the community comes together and you finally get to see all those people in person whom you otherwise only know from LinkedIn, the vExpert Discord, or local VMUG events. For me, that’s actually always the highlight of every event and, in my view, the very heart of VMUG. I also had great conversations with Duncan Epping, Chris McCain, and other VMware leaders, and of course plenty of chats with vExperts, VMware users, partners, and customers. In the end, that’s what makes these events special for me—not the sessions or any slides, but the direct interaction.</p>
<p>Many thanks to everyone I had the pleasure of sharing the event with. ❤️❤️❤️</p>
]]></content>
		</item>
		
		<item>
			<title>IPv6 with NSX</title>
			<link>https://sdn-warrior.org/posts/ipv6-nsx/</link>
			<pubDate>Sun, 01 Mar 2026 12:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/ipv6-nsx/</guid>
			<description><![CDATA[A look at IPv6 and the situation with NSX. In this blog, I navigate the valley of tears known as IPv6, demonstrate how it can be utilized with NSX, and explain some of the fundamentals.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>IPv6 has been around as a standard since 1998, so it feels like forever. I was 16 years old at the time. Since then, we&rsquo;ve been hearing that IPv4 addresses are running out and IPv6 is the solution. And although IPv6 is now more than 25 years old (do you feel as old as I do now?), it is still far from being the norm. To be honest, in the last 25 years, I&rsquo;ve had exactly one customer who uses IPv6 in their data center, and that&rsquo;s only as a dual stack.</p>
<p>According to estimates by <a href="https://blog.cloudflare.com/ipv6-from-dns-pov/">Cloudflare from 2023</a>, global adoption of IPv6 is around 35.9%. ChatGPT puts it at around 45%, but either way we are still a long way from widespread adoption. PS: My blog can be reached natively over IPv6.</p>
<p>Nevertheless, I wanted to take a look at the status quo and set up a working IPv6 setup with NSX, but unfortunately it&rsquo;s not as easy as I had imagined.
Perhaps I should briefly describe my starting point. I have a consumer DSL connection from Deutsche Telekom. They generously provide me with a public IPv4 address and a /56 IPv6 prefix. I already have my OPNsense appliance running in dual stack. I will only discuss this configuration to a limited extent, as it can be different for each provider.</p>
<p>Unfortunately, my IPv4 address and IPv6 prefix are dynamic because I have a normal residential connection and not a business plan. Thanks for that – not. Unfortunately, this also means that I have to come up with something else for the IPv6 prefixes for NSX. More on that later. I also have a MikroTik router between OPNsense and the NSX T0 router.</p>
<figure><a href="01.png"><picture><source srcset="/ipv6-nsx/01_hu_b0a7e55fb2327430.png" type="image/png">
          <img
            src="/ipv6-nsx/01_hu_b0a7e55fb2327430.png"alt="Network"width="1323"
            height="839"/>
        </picture></a><figcaption><p>Network setup (click to enlarge)</p></figcaption></figure>
<p>The current setup is deliberately simple. My T0 router is active/standby at this point and I only use one uplink VLAN.
The reason for this is simple. The entire setup runs on a small <a href="https://sdn-warrior.org/posts/vcf9-dark-site-edge/">NUC Ultra 7</a> with only one LAN card. Any HA setup would be excessive here. But before we jump into the setup, we need to talk about a few IPv6 basics.</p>
<h2 id="gua--ula--link-local-and-other-ipv6-basics">GUA / ULA / Link Local and other IPv6 basics</h2>
<h3 id="gua">GUA</h3>
<p>We start with what is probably the simplest prefix range, the GUA or Global Unicast Address.
These addresses are publicly routable on the Internet and come from <em><strong>2000::/3</strong></em>.</p>
<h3 id="ula">ULA</h3>
<p>The ULA prefix is the Unique Local Address range and is comparable to the private IP ranges of IPv4. These addresses are not routed on the Internet and play an important role in my setup, as I unfortunately get assigned a dynamic prefix by my provider. ULA covers <em><strong>fc00::/7</strong></em>.
The first bit following the /7 prefix (the L bit) indicates, if set, that the address is locally assigned. This splits the address block into two equally sized blocks, fc00::/8 and fd00::/8. fd00::/8 is the range that is actually used for private addressing. NSX uses an fc00::/8 network by default for hotplug networks between T0 and T1 routers. For this reason alone, you should avoid fc00::/8 and use fd00::/8 networks for private addressing, even though the risk of duplicate addresses is very low.</p>
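<p>The split can be checked quickly with Python&rsquo;s standard ipaddress module. This is only a minimal sketch; the helper name l_bit is made up for this illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress

# fc00::/7 is the whole ULA block; one more prefix bit (the L bit) splits it in half.
ula = ipaddress.ip_network("fc00::/7")
halves = [str(net) for net in ula.subnets(prefixlen_diff=1)]


def l_bit(address):
    """Return the L bit (8th bit of the address): 1 means locally assigned."""
    first_byte = ipaddress.ip_address(address).packed[0]
    return first_byte % 2


print(halves)                        # the two /8 halves of fc00::/7
print(l_bit("fd11:22:33:ff10::1"))   # an address from my lab range
print(l_bit("fc00::1"))              # an address from the other half
</code></pre></div>
<p>The two halves come out as fc00::/8 and fd00::/8, and only addresses from fd00::/8 have the L bit set.</p>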
<h3 id="link-local">Link-Local</h3>
<p>A link-local (LL) address is automatically available on every interface. It is used for Neighbor Discovery, Router Advertisements, and often as the BGP next hop, and it is only valid within the local Layer 2 segment.
LL addresses come from <em><strong>fe80::/10</strong></em>.</p>
<h3 id="multicast">Multicast</h3>
<p>Multicast replaces broadcast in IPv6. It is used for neighbor discovery, DAD, and router advertisements. Multicast addresses can be recognized by the following prefix: <em><strong>ff00::/8</strong></em>.</p>
<h3 id="ipv4-mapped-ipv6">IPv4-mapped IPv6</h3>
<p>These are IPv4 addresses in IPv6 format, used for example with MP-BGP when IPv6 routes are transported over an IPv4 session. The prefix is <em><strong>::ffff:0:0/96</strong></em>.
I had planned to use this for my NSX networks, but my OPNSense couldn&rsquo;t handle it. My IPv6 routes never showed up in the routing table. I couldn&rsquo;t figure out whether that was due to OPNSense&rsquo;s FRR implementation or the Mikrotik.</p>
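<p>To keep the ranges apart, here is a small sketch that sorts an address into one of the categories above. It only uses Python&rsquo;s standard ipaddress module; the function name classify and the order of the checks are my own choices for this illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress

# Prefixes taken from the sections above; checked in order, first match wins.
RANGES = [
    ("IPv4-mapped", ipaddress.ip_network("::ffff:0:0/96")),
    ("Multicast", ipaddress.ip_network("ff00::/8")),
    ("Link-Local", ipaddress.ip_network("fe80::/10")),
    ("ULA", ipaddress.ip_network("fc00::/7")),
    ("GUA", ipaddress.ip_network("2000::/3")),
]


def classify(address):
    """Return the name of the first range that contains the address."""
    addr = ipaddress.ip_address(address)
    for name, network in RANGES:
        if addr in network:
            return name
    return "other"


samples = [
    "2a00:1450:4001:804::200e",
    "fd11:22:33:ff10::1",
    "fe80::250:56ff:fe86:9e5",
    "ff02::1",
    "::ffff:10.28.25.12",
]
for sample in samples:
    print(sample, classify(sample))
</code></pre></div>
<p>Anything that matches none of the five ranges, such as the loopback address ::1, ends up as other.</p>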
<h3 id="prefix-delegation">Prefix Delegation</h3>
<p>My internet provider gives me a GUA /56 prefix via prefix delegation. Unfortunately, as already mentioned, this is dynamic. My OPNSense supports dynamic prefixes, and thanks to the /56, I can create up to 256 /64 networks and assign them to individual VLAN interfaces on the OPNSense. Unfortunately, NSX does not support this, which is why I ultimately had to resort to NAT66 and ULA.</p>
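<p>The 256 is simply 2 to the power of (64 - 56). As a minimal sketch, this is how the /64 networks fall out of a delegated /56. The prefix used here is a placeholder from the documentation range 2001:db8::/32, not my real Telekom prefix.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress

# Placeholder /56 from the documentation range, standing in for the delegated prefix.
delegated = ipaddress.ip_network("2001:db8:0:b100::/56")

# Every /64 inside the /56: 64 - 56 = 8 extra bits, so 2 ** 8 networks.
lans = list(delegated.subnets(new_prefix=64))

print(len(lans))          # number of /64 networks available
print(lans[0], lans[-1])  # first and last /64 of the delegation
</code></pre></div>
<p>Each of these /64 networks could go to one VLAN interface, and all of them change as soon as the provider hands out a new /56.</p>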
<h3 id="slaac">SLAAC</h3>
<p>SLAAC stands for Stateless Address Autoconfiguration and enables a host to automatically generate its IPv6 address from a /64 prefix announced by the router. To do this, the router sends router advertisements containing the prefix and the default gateway. The client then independently forms its complete IPv6 address without the need for a DHCP server to assign the address.</p>
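<p>To make the address generation a bit more tangible, here is a sketch of the classic modified EUI-64 method, where the interface identifier is derived from the MAC address. Many current operating systems use random or stable-privacy identifiers instead, so this is one possible way a host forms its address, not the only one. The MAC address is an assumption, inferred from the link-local address of my Alpine VM that shows up later in this post, and both function names are made up for the example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress


def eui64_interface_id(mac):
    """Build the modified EUI-64 interface identifier (as an int) from a MAC address."""
    octets = [int(part, 16) for part in mac.split(":")]
    octets[0] = octets[0] ^ 0x02                  # flip the universal/local bit
    eui = octets[:3] + [0xFF, 0xFE] + octets[3:]  # insert ff:fe in the middle
    return int.from_bytes(bytes(eui), "big")


def slaac_address(prefix, mac):
    """Combine an advertised /64 prefix with the interface identifier."""
    network = ipaddress.ip_network(prefix)
    return ipaddress.ip_address(int(network.network_address) + eui64_interface_id(mac))


# Assumed MAC address of a VMware-style NIC (00:50:56 prefix).
print(slaac_address("fe80::/64", "00:50:56:86:09:e5"))
print(slaac_address("fd11:22:33:ff10::/64", "00:50:56:86:09:e5"))
</code></pre></div>
<p>With this MAC address, the link-local result is fe80::250:56ff:fe86:9e5. The second call shows what an EUI-64 based address in one of my ULA segments would look like.</p>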

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>Warning</b>
        </div>
        <div class="admonition-content">SLAAC itself only distributes the prefix and the default gateway. DNS servers are only provided if the router supports RDNSS (DNS via RA). Not all operating systems—especially minimalist distributions such as Alpine—reliably accept DNS information from router advertisements. As a result, IPv6 may work, but name resolution may not (this is a minor spoiler).</div>
    </aside>
<h3 id="dhcpv6">DHCPv6</h3>
<p>DHCPv6 provides IPv6 addresses and additional network information centrally via a server. It can either assign the complete address (stateful) or only provide additional information such as DNS (stateless). If DHCP runs in stateless mode, SLAAC is used for IP address assignment.</p>

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>Warning</b>
        </div>
        <div class="admonition-content">Not every device or operating system supports stateful DHCPv6! Welcome to the brave new world of IPv6.</div>
    </aside>
<h3 id="nat66">NAT66</h3>
<p>NAT66 is the IPv6 variant of Network Address Translation, in which the source IPv6 address is replaced by another IPv6 address when leaving the network. It is often used when ULA addresses are used internally and the provider assigns a dynamic global prefix. NAT66 works statefully, which means that the router stores connection states for reverse translation. This keeps the internal address design stable, regardless of the external prefix. However, NAT66 contradicts the end-to-end principle of IPv6.</p>
<h3 id="nsx-supportet-ipv6-features">NSX Supported IPv6 features</h3>
<ul>
<li>IPv6 and Dual-Stack Overlay Segments</li>
<li>IPv6 Distributed Routing (Tier-0 / Tier-1)</li>
<li>IPv6 BGP (MP-BGP)</li>
<li>IPv6 Distributed Firewall (L3/L4)</li>
<li>IPv6 Gateway Firewall</li>
<li>IPv6 Load Balancer VIPs</li>
<li>NAT66 support</li>
<li>DHCPv6 (Stateful &amp; Stateless)</li>
<li>SLAAC support</li>
</ul>

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>Warning</b>
        </div>
        <div class="admonition-content"><p>Unsupported IPv6 features are:</p>
<ul>
<li>EVPN</li>
<li>Multicast Routing</li>
<li>NSX Federation</li>
<li>VPCs</li>
</ul></div>
    </aside>
<h3 id="nsx-dad-profile">NSX DAD-Profile</h3>
<p>I could make a DAD joke now, but&hellip;</p>
<p><em><strong>Router:</strong></em> &ldquo;Doctor, it hurts when IP.&rdquo;
<em><strong>Doctor:</strong></em> &ldquo;Then stop using IPv6.&rdquo;</p>
<p>Okay, let&rsquo;s not go there, that&rsquo;s awful&hellip;</p>
<p>Duplicate Address Detection (DAD) in NSX is an IPv6 mechanism that ensures that an IPv6 address is unique within a segment before it is actively used.
NSX uses the standardized Neighbor Discovery Protocol to check whether an address is already being used by another VM, thereby preventing address conflicts.</p>
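<p>For the curious: in Neighbor Discovery, the DAD probe is a Neighbor Solicitation sent to the solicited-node multicast group of the address being tested. That group is ff02::1:ff00:0/104 plus the lowest 24 bits of the address. A minimal sketch of the calculation, with a function name I made up for this example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress


def solicited_node_multicast(address):
    """Return ff02::1:ffXX:XXXX built from the low 24 bits of the address."""
    low24 = int(ipaddress.ip_address(address)) % (2 ** 24)
    base = int(ipaddress.ip_address("ff02::1:ff00:0"))
    return ipaddress.ip_address(base + low24)


# DAD probes this group before the tentative address goes live.
print(solicited_node_multicast("fd11:22:33:ff10::1"))
print(solicited_node_multicast("fe80::250:56ff:fe86:9e5"))
</code></pre></div>
<p>If nobody answers the probe, the address is considered unique and can be used.</p>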
<h3 id="nsx-nd-profile">NSX ND-Profile</h3>
<p>An ND (Neighbor Discovery) profile in NSX defines parameters such as router advertisements, the DHCPv6 mode (stateful/stateless), and SLAAC behavior for a segment.
The ND profile controls how IPv6 addresses are assigned and which default gateway information is distributed to the workloads.</p>
<h3 id="that-was-quite-a-lot">That was quite a lot</h3>
<p>Now that we&rsquo;ve refreshed the basics, let&rsquo;s move on to the setup.</p>
<h2 id="configuring-the-opnsense-and-mikrotik-router">Configuring the OPNSense and Mikrotik router</h2>
<p>I decided to set up my routing dynamically with BGP and fell into a few traps.
Since I can&rsquo;t do much with my 4,722,366,482,869,645,213,696 public IPv6 addresses (that&rsquo;s an unimaginable 4.7 sextillion IP addresses)
because they are not permanent, I have decided on a ULA setup and will then use NAT66.
I take my ULA networks from the fd11:22:33::/56 prefix range. It is still completely unused and relatively easy to read. Strictly speaking, the /64 networks below use ffXX in the fourth group, so they sit in fd11:22:33:ff00::/56, and fd11:22:33::/48 is the aggregate that covers all of them. My DNS servers have long been IPv6-capable and have their own ULA IPs, which are somewhat arbitrary, however. Historically grown, as we say in Germany.
To keep everything clear, here is a short table:</p>
<table>
  <thead>
      <tr>
          <th>Purpose / Connection</th>
          <th>Prefix</th>
          <th>Description</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>ULA Core Aggregate</td>
          <td>fd11:22:33::/56</td>
          <td>Main internal IPv6 prefix</td>
      </tr>
      <tr>
          <td>OPNsense ↔ MikroTik Transit</td>
          <td>fd11:22:33:ff02::/64</td>
          <td>transit network</td>
      </tr>
      <tr>
          <td>MikroTik ↔ NSX T0 Transit</td>
          <td>fd11:22:33:ff01::/64</td>
          <td>transit network</td>
      </tr>
      <tr>
          <td>NSX Segment 1</td>
          <td>fd11:22:33:ff10::/64</td>
          <td>Workload</td>
      </tr>
      <tr>
          <td>NSX Segment 2</td>
          <td>fd11:22:33:ff11::/64</td>
          <td>Workload</td>
      </tr>
      <tr>
          <td>NSX Hotplug Network</td>
          <td>fc00::/8</td>
          <td>T0 - T1 network</td>
      </tr>
      <tr>
          <td>Public IP Prefix</td>
          <td>2003:c4:XXXX:b100::/56</td>
          <td>Telekom Internet</td>
      </tr>
  </tbody>
</table>
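<p>The prefix arithmetic behind the table can be double-checked with a few lines of Python. This is only a sanity check with the standard ipaddress module; the helper name covered_by is made up. It confirms the 2 to the power of 72 addresses in a /56 and shows which candidate aggregate actually contains the /64 networks from the table.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress

aggregate = ipaddress.ip_network("fd11:22:33::/56")
table = {
    "OPNsense - MikroTik transit": "fd11:22:33:ff02::/64",
    "MikroTik - NSX T0 transit": "fd11:22:33:ff01::/64",
    "NSX Segment 1": "fd11:22:33:ff10::/64",
    "NSX Segment 2": "fd11:22:33:ff11::/64",
}

# A /56 leaves 128 - 56 = 72 bits for subnets and hosts.
print(aggregate.num_addresses == 2 ** 72)
print(2 ** 72)


def covered_by(candidate):
    """Map each table entry to True if it lies inside the candidate aggregate."""
    net = ipaddress.ip_network(candidate)
    return {name: ipaddress.ip_network(p).subnet_of(net) for name, p in table.items()}


print(covered_by("fd11:22:33::/56"))
print(covered_by("fd11:22:33:ff00::/56"))
print(covered_by("fd11:22:33::/48"))
</code></pre></div>
<p>Taken literally, fd11:22:33::/56 does not contain the ffXX networks, while fd11:22:33:ff00::/56 and fd11:22:33::/48 do. That matters as soon as a firewall or NAT rule matches on the aggregate.</p>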
<p>The target setup is shown as a graphic.</p>
<figure><a href="02.png"><picture><source srcset="/ipv6-nsx/02_hu_93c2e4dae02d1a99.png" type="image/png">
          <img
            src="/ipv6-nsx/02_hu_93c2e4dae02d1a99.png"alt="Network IPs"width="1414"
            height="839"/>
        </picture></a><figcaption><p>Network setup - IPs (click to enlarge)</p></figcaption></figure>
<p>First, I configure a static IPv6 address on the BGP interface of my OPNSense. It is important to select the right IPv6 configuration type: it must be set to Static IPv6 and not Track Interface. With Track Interface, the interface would take a /64 prefix from my GUA range, which unfortunately is not static.</p>
<figure><a href="03.png"><picture><source srcset="/ipv6-nsx/03_hu_647e726cfed8b751.png" type="image/png">
          <img
            src="/ipv6-nsx/03_hu_647e726cfed8b751.png"alt="OPNSense Config"width="1214"
            height="1120"/>
        </picture></a><figcaption><p>OPNSense config (click to enlarge)</p></figcaption></figure>
<p>It is also important to assign the IP address cleanly with a /64, otherwise the network cannot be used for SLAAC. SLAAC is what we want to use on the other end, as all dual-stack capable devices and operating systems support it.</p>

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>Warning</b>
        </div>
        <div class="admonition-content">Explicit IPv6 firewall rules must be written. In addition, IPv4 may be preferred in OPNSense. This can be deactivated under System -&gt; Settings -&gt; General -&gt; Prefer IPv4 over IPv6.</div>
    </aside>
<p>Next, IPv6 must be enabled on the Mikrotik. This requires a reboot and can be configured under IPv6 -&gt; Settings -&gt; Disable IPv6. By default, the checkbox is selected and must be deselected. One might almost think that none of the vendors are really interested in IPv6.</p>
<p>But let&rsquo;s get to the first problem I had. My lab already had a BGP configuration between the lab router and OPNSense via IPv4. In my setup, OPNSense Business Edition did not properly handle IPv6 route exchange over an IPv4 session.
I saw my IPv6 routes in the FRR plugin, but not in the routing table of the OPNSense.
<figure><a href="04.png"><picture><source srcset="/ipv6-nsx/04_hu_b02ce72df2b325c8.png" type="image/png">
          <img
            src="/ipv6-nsx/04_hu_b02ce72df2b325c8.png"alt="OPNSense IPv6 Routing"width="1412"
            height="574"/>
        </picture></a><figcaption><p>OPNSense Routing (click to enlarge)</p></figcaption></figure></p>
<p>It didn&rsquo;t work in the other direction either. OPNSense simply didn&rsquo;t want to learn IPv6 routes over IPv4.
After an hour of troubleshooting, I gave up and configured separate IPv4 and IPv6 BGP sessions.<br>
The settings are relatively straightforward. I am listing my Mikrotik settings here as an example, but there is nothing special about them.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl"> ;;; OPNSenseIPv6
</span></span><span class="line"><span class="cl">     name=&#34;OPNSenseIPv6&#34; instance=OPNSense 
</span></span><span class="line"><span class="cl">     remote.address=fd11:22:33:ff02::1/128 .as=65101 
</span></span><span class="line"><span class="cl">     local.default-address=fd11:22:33:ff02::2 .role=ebgp 
</span></span><span class="line"><span class="cl">     connect=yes listen=yes routing-table=main as=65102 nexthop-choice=default hold-time=3m 
</span></span><span class="line"><span class="cl">     keepalive-time=1m afi=ipv6 use-bfd=yes 
</span></span><span class="line"><span class="cl">     output.redistribute=connected,bgp 
</span></span></code></pre></div><p>Here are the OPNSense settings, but they are pretty standard. Thanks to my new Hugo Gallery shortcode, it&rsquo;s now easier to view them.</p>




    <div class="image-gallery" id="gallery-d4b41362">
                <a href="/ipv6-nsx/opnsense-routing.png" class="gallery-item" data-lightbox-group="gallery-d4b41362" data-title="Ipv6 Nsx Opnsense Routing"><img src="/ipv6-nsx/opnsense-routing_hu_dc1705084a688542.png" alt="/ipv6-nsx/opnsense-routing" width="300" height="300" loading="lazy">
                    <div class="gallery-item-caption">Ipv6 Nsx Opnsense Routing</div>
                </a>
                <a href="/ipv6-nsx/opnsense-neigbor.png" class="gallery-item" data-lightbox-group="gallery-d4b41362" data-title="Ipv6 Nsx Opnsense Neigbor"><img src="/ipv6-nsx/opnsense-neigbor_hu_d23fc6e3ac7aaaa4.png" alt="/ipv6-nsx/opnsense-neigbor" width="300" height="300" loading="lazy">
                    <div class="gallery-item-caption">Ipv6 Nsx Opnsense Neigbor</div>
                </a></div>
<p>Now that that&rsquo;s done, I can finally get started on the NSX setup.</p>
<h2 id="nsx-ipv6-setup---it-just-works">NSX IPv6 setup - it just works?</h2>
<p>Of course not! Anyone who thought it would all be easy is mistaken. What I didn&rsquo;t really know—but what a quick glance at the documentation revealed—is that NSX has IPv6 forwarding disabled by default. Apparently, no one really wants to use IPv6.</p>
<figure><a href="05.png"><picture><source srcset="/ipv6-nsx/05_hu_a421709109d75c7c.png" type="image/png">
          <img
            src="/ipv6-nsx/05_hu_a421709109d75c7c.png"alt="NSX Global Network Config"width="1392"
            height="1007"/>
        </picture></a><figcaption><p>NSX Global Network Config (click to enlarge)</p></figcaption></figure>
<p>So I quickly jumped into the NSX Global Network config and enabled IPv6 L3 forwarding.
For IPv6 to work, the T0 router must receive IPv6 addresses in addition to IPv4 addresses.</p>
<figure><a href="06.png"><picture><source srcset="/ipv6-nsx/06_hu_b0d1063f93242ee8.png" type="image/png">
          <img
            src="/ipv6-nsx/06_hu_b0d1063f93242ee8.png"alt="NSX Edge Network Config"width="1442"
            height="1050"/>
        </picture></a><figcaption><p>NSX Edge Network Config (click to enlarge)</p></figcaption></figure>
<p>Next, I configure my IPv6 routing, and this is where RFC 5549 comes into play. To minimize the number of BGP sessions and IPv4 addresses, you can exchange both IPv4 and IPv6 routes over a single IPv6 BGP session.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Support for encoding and processing an IPv4 route with an IPv6 next hop is negotiated as part of the capability exchange in the BGP OPEN message. If both sides of a peering session support the capability, IPv4 routes are advertised with an IPv6 next hop. Multi-protocol BGP (MP-BGP) is used to advertise the Network Layer Reachability Information of an IPv4 address family using the next hop of an IPv6 address family.</div>
    </aside>
<p>Mikrotik now also supports RFC 5549. The Mikrotik configuration is very similar to the OPNSense configuration. The main difference is that both IPv4 and IPv6 address families are enabled in one session.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">;;; BGP_VCF09-E01-Edge01
</span></span><span class="line"><span class="cl">     name=&#34;VCF09-E01-Edge1-01&#34; instance=VCF09-E01-Edge1-U1 
</span></span><span class="line"><span class="cl">     remote.address=fd11:22:33:ff01::2/128 .as=65004 
</span></span><span class="line"><span class="cl">     local.default-address=fd11:22:33:ff01::1 .role=ebgp 
</span></span><span class="line"><span class="cl">     connect=yes listen=yes routing-table=main as=65102 hold-time=3m keepalive-time=1m afi=ip,ipv6 
</span></span><span class="line"><span class="cl">     use-bfd=yes 
</span></span><span class="line"><span class="cl">     output.redistribute=connected,bgp .default-originate=if-installed 
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">;;; BGP_VCF09-E01-Edge02
</span></span><span class="line"><span class="cl">     name=&#34;VCF09-E01-Edge2-01&#34; instance=VCF09-E01-Edge1-U1 
</span></span><span class="line"><span class="cl">     remote.address=fd11:22:33:ff01::3/128 .as=65004 
</span></span><span class="line"><span class="cl">     local.default-address=fd11:22:33:ff01::1 .role=ebgp 
</span></span><span class="line"><span class="cl">     connect=yes listen=yes routing-table=main as=65102 hold-time=3m keepalive-time=1m afi=ip,ipv6 
</span></span><span class="line"><span class="cl">     use-bfd=yes 
</span></span><span class="line"><span class="cl">     output.redistribute=connected,bgp .default-originate=if-installed 
</span></span></code></pre></div><p>Next, the routing must be configured on the T0 router. It is important to note that an IPv6 BGP session must be created and an IPv6 route filter must be added in addition to the default IPv4 route filter. Otherwise, the routing will not work. The rest of the configuration is standard. I always use BFD in my setups for fast network convergence, but it is not a must.</p>




    <div class="image-gallery" id="gallery-c1310a94">
                <a href="/ipv6-nsx/edge%20neigbor.png" class="gallery-item" data-lightbox-group="gallery-c1310a94" data-title="Ipv6 Nsx Edge Neigbor"><img src="/ipv6-nsx/edge%20neigbor_hu_29e8a308f17b8715.png" alt="/ipv6-nsx/edge neigbor" width="300" height="300" loading="lazy">
                    <div class="gallery-item-caption">Ipv6 Nsx Edge Neigbor</div>
                </a>
                <a href="/ipv6-nsx/edge%20routefilter.png" class="gallery-item" data-lightbox-group="gallery-c1310a94" data-title="Ipv6 Nsx Edge Routefilter"><img src="/ipv6-nsx/edge%20routefilter_hu_dc3ed2c2f09fd42f.png" alt="/ipv6-nsx/edge routefilter" width="300" height="300" loading="lazy">
                    <div class="gallery-item-caption">Ipv6 Nsx Edge Routefilter</div>
                </a></div>
<p>BGP peering should now be up and running. The IPv4 routing table on the Mikrotik looks a little strange. The immediate gateway shows the link-local IP address of the NSX Edge node. The IPv4 address shown there is only how RouterOS represents IPv4 over IPv6 and is not actually used.</p>




    <div class="image-gallery" id="gallery-3629e91e">
                <a href="/ipv6-nsx/ipv4-over-ipv6.png" class="gallery-item" data-lightbox-group="gallery-3629e91e" data-title="Ipv6 Nsx Ipv4 Over Ipv6"><img src="/ipv6-nsx/ipv4-over-ipv6_hu_498dd5864979a5ee.png" alt="/ipv6-nsx/ipv4-over-ipv6" width="300" height="300" loading="lazy">
                    <div class="gallery-item-caption">Ipv6 Nsx Ipv4 Over Ipv6</div>
                </a>
                <a href="/ipv6-nsx/ipv6.png" class="gallery-item" data-lightbox-group="gallery-3629e91e" data-title="Ipv6 Nsx Ipv6"><img src="/ipv6-nsx/ipv6_hu_8a4d2348afbd97f2.png" alt="/ipv6-nsx/ipv6" width="300" height="300" loading="lazy">
                    <div class="gallery-item-caption">Ipv6 Nsx Ipv6</div>
                </a></div>

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>Warning</b>
        </div>
        <div class="admonition-content">Even if the IPv6 session is used for route exchange, routing to IPv4 networks only works if the EdgeVMs also have IPv4 addresses on the uplink interfaces.
If these configurations are missing, routes for the IPv4 networks exist but cannot be reached.</div>
    </aside>
<p>In a traceroute from the Mikrotik, you can also clearly see that for IPv4 networks, the next hop is actually the IPv4 address of the active EdgeVM.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[admin@router.lab.home] &gt; tool/traceroute 10.10.0.1
</span></span><span class="line"><span class="cl">Columns: ADDRESS, LOSS, SENT, LAST, AVG, BEST, WORST, STD-DEV
</span></span><span class="line"><span class="cl">#  ADDRESS      LOSS  SENT  LAST   AVG  BEST  WORST  STD-DEV
</span></span><span class="line"><span class="cl">0  10.28.25.12  0%       3  0.6ms  0.5  0.4   0.6    0.1    
</span></span><span class="line"><span class="cl">1  10.10.0.1    0%       3  0.3ms  0.3  0.2   0.4    0.1  
</span></span></code></pre></div><h2 id="ipv6-segments-nd-and-dad">IPv6 segments, ND and DAD&hellip;</h2>
<p>The routing part is done. It would have been much faster with static routing, but that would also have been much more boring. Now I can finally create segments. And yes, I know I already created some for my tests, because without segments, there are no networks and therefore no routing test. But just imagine I hadn&rsquo;t done that yet.</p>
<p>In my setup, I decided to build a T1 router for my IPv6 segments and a T1 router for my IPv4 networks, but why did I do that? Firstly, for clarity, and secondly, because of the ND profile, which is still needed. The ND profiles can be defined either globally at the T0 level or more specifically at the T1 level.
NSX supports 5 modes for ND:</p>
<ul>
<li>Disabled - Router advertisement messages are disabled.</li>
<li>SLAAC with DNS Through RA - The address and DNS information is generated with the router advertisement message.</li>
<li>SLAAC with DNS Through DHCP - The address is generated with the router advertisement message and the DNS information is generated by the DHCP server.</li>
<li>DHCP with Address and DNS through DHCP - The address and DNS information is generated by the DHCP server.</li>
<li>SLAAC with Address and DNS through DHCP - The address and DNS information is generated by the DHCP server. This option is only supported by NSX Edge and not by ESX hosts.</li>
</ul>
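<p>To pick a mode, it helps to look at what the client itself can do. The following sketch simply encodes the list above as a lookup table: where the address comes from and where DNS comes from. The function usable_modes and its two parameters are made up for this illustration, and the model is deliberately simplified. It is not taken from the NSX documentation.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"># Source of address and DNS per ND mode, as described in the list above.
ND_MODES = {
    "Disabled": None,
    "SLAAC with DNS Through RA": {"address": "RA", "dns": "RA"},
    "SLAAC with DNS Through DHCP": {"address": "RA", "dns": "DHCP"},
    "DHCP with Address and DNS through DHCP": {"address": "DHCP", "dns": "DHCP"},
    "SLAAC with Address and DNS through DHCP": {"address": "DHCP", "dns": "DHCP"},
}


def usable_modes(has_dhcpv6_client, accepts_dns_from_ra):
    """Modes in which a client would get both an address and DNS servers."""
    result = []
    for mode, entry in ND_MODES.items():
        if entry is None:
            continue  # no router advertisements, nothing to configure from
        sources = {entry["address"], entry["dns"]}
        if "DHCP" in sources and not has_dhcpv6_client:
            continue
        if entry["dns"] == "RA" and not accepts_dns_from_ra:
            continue
        result.append(mode)
    return result


# A client without a DHCPv6 client that also ignores DNS from RAs.
print(usable_modes(has_dhcpv6_client=False, accepts_dns_from_ra=False))
# The same client after a DHCPv6 client has been installed.
print(usable_modes(has_dhcpv6_client=True, accepts_dns_from_ra=False))
</code></pre></div>
<p>The first call returns an empty list, which is pretty much the situation I ran into with my Alpine template.</p>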
<p>The default is SLAAC with DNS Through RA, but as I already mentioned, problems can arise with some operating systems or devices. Specifically, some of them cannot take DNS servers from router advertisements. Alpine Linux is one such case. I do get a valid IPv6 address and can ping my DNS server with it, but name resolution does not work, and without that, the internet is not particularly useful. The DNS servers were, of course, configured correctly in the ND profile.</p>
<p>That&rsquo;s why I create an ND profile with SLAAC with DNS Through DHCP. This has the advantage that SLAAC and RA can generally be used, but DHCPv6 stateless is also used to explicitly assign DNS. The profiles can be found under Networking -&gt; Networking Profiles -&gt; Select profile type -&gt; ND Profile. In the new profile, I only change the mode; otherwise, all settings remain at default. Under Networking -&gt; Tier-1 Gateways -&gt; Additional Settings -&gt; ND Profile, I then select my new profile for my IPv6 T1 router.</p>
<p>All that&rsquo;s missing is our DAD&hellip;</p>
<p>Why did the boy get fired from his keyboard factory job? - Because he was not doing enough shifts.</p>
<p>Again? Really? OK, let&rsquo;s keep it short: the default DAD profile from NSX is sufficient for my purposes and can be assigned to both T1 and T0 in virtually the same way as the ND profile. And now I&rsquo;m never going to talk about DAD ag&hellip;Why don’t programmers like nature? - Too many bugs.</p>
<p>Okay, continuing in context, I want to finish the article, even though I&rsquo;ve just spent 30 minutes scrolling through Reddit and reading dad jokes.</p>
<p>Then I create an overlay segment connected to my IPv6 T1 and configure the following IPv6 gateway CIDR: fd11:22:33:ff10::1/64. I also create a segment DHCP server in which I specify the ULA IPs of my DNS servers in the lab and, of course, a DHCP server IP address from the segment. In my case, I simply take the next free IP fd11:22:33:ff10::2/64.</p>




    <div class="image-gallery" id="gallery-92172efd">
                <a href="/ipv6-nsx/dhcp-config.png" class="gallery-item" data-lightbox-group="gallery-92172efd" data-title="Ipv6 Nsx DHCP Config"><img src="/ipv6-nsx/dhcp-config_hu_143a80ba7fcf45f9.png" alt="/ipv6-nsx/dhcp-config" width="300" height="300" loading="lazy">
                    <div class="gallery-item-caption">Ipv6 Nsx DHCP Config</div>
                </a>
                <a href="/ipv6-nsx/dhcp-profile.png" class="gallery-item" data-lightbox-group="gallery-92172efd" data-title="Ipv6 Nsx DHCP Profile"><img src="/ipv6-nsx/dhcp-profile_hu_1110274d5a161060.png" alt="/ipv6-nsx/dhcp-profile" width="300" height="300" loading="lazy">
                    <div class="gallery-item-caption">Ipv6 Nsx DHCP Profile</div>
                </a></div>
<p>Next, I create a VM from my Alpine template and connect it to the new segment, power up the VM, and am disappointed.</p>
<h2 id="are-we-there-yet">Are we there yet?</h2>
<p>In order for my Alpine VM to get an IPv6 address at all, I have to edit the file /etc/network/interfaces and add iface eth0 inet6 auto.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">auto lo
</span></span><span class="line"><span class="cl">iface lo inet loopback
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">auto eth0
</span></span><span class="line"><span class="cl"><span class="c1">#iface eth0 inet dhcp</span>
</span></span><span class="line"><span class="cl">iface eth0 inet6 auto
</span></span></code></pre></div><p>One thing is already working after reboot: my VM is getting a ULA IP address via SLAAC and can also ping my DNS server.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">inet6 fd11:22:33:ff10:2b17:367c:2ddc:2925/64 scope global dynamic noprefixroute flags 100 
</span></span><span class="line"><span class="cl">  valid_lft 2591977sec preferred_lft 604777sec
</span></span><span class="line"><span class="cl">inet6 fe80::250:56ff:fe86:9e5/64 scope link 
</span></span><span class="line"><span class="cl">  valid_lft forever preferred_lft forever
</span></span></code></pre></div><p>Ping test:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">template:~# ping6 fd::101
</span></span><span class="line"><span class="cl">PING fd::101 (fd::101): 56 data bytes
</span></span><span class="line"><span class="cl">64 bytes from fd::101: seq=0 ttl=60 time=1.157 ms
</span></span><span class="line"><span class="cl">64 bytes from fd::101: seq=1 ttl=60 time=1.190 ms
</span></span><span class="line"><span class="cl">64 bytes from fd::101: seq=2 ttl=60 time=1.471 ms
</span></span></code></pre></div><p>However, name resolution still does not work, or rather, DHCPv6 is ignored.
For this to work, Alpine needs a small additional package: dhcpcd. I should have mentioned this earlier, but then the suspense would have been ruined. I will incorporate it into my template in the future.
Installation is super easy if you have internet access.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">apk update
</span></span><span class="line"><span class="cl">apk add dhcpcd
</span></span><span class="line"><span class="cl">rc-update add dhcpcd
</span></span></code></pre></div><p>Then restart the network service or simply reboot the entire VM, which is faster for me, but then again, I&rsquo;m no Linux expert.
Wow, I finally have name resolution, but only with an IPv6-capable DNS server, of course. In my environment, that&rsquo;s two Adguard Home instances, but I could also just use my OPNSense. Does the internet work now? Of course not. I get nice IPv6 name resolution and the routing works too. And no, it&rsquo;s not the firewall rules this time: NAT66 is missing.</p>
<p>Because the entire setup is based on ULA IPs due to the dynamic Telekom prefix, and these are not routable on the internet, we now have to resort to NAT again. I would have liked to use NPTv6 (prefix translation), but that is not possible with my setup.
I can&rsquo;t track the PPPoE interface, which means that every time Telekom assigns me a new prefix, I would have to adjust the NPTv6 rule manually like a caveman. So in the end, it will be stateful NAT from an IPv6 ULA address to an IPv6 GUA address. The whole thing is comparable to normal SNAT with IPv4. NSX supports NAT66, but since NSX cannot be a prefix delegation client, that doesn&rsquo;t help me in my situation.</p>
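<p>To show why a changing prefix is such a problem for NPTv6, here is a simplified sketch of stateless prefix translation. It only swaps the prefix bits and keeps the rest of the address. Real NPTv6 additionally adjusts the address so that checksums stay valid, which I leave out here. The function name is made up, the inside prefix is the /56 that contains my workload segments, and the outside prefixes are placeholders from the documentation range 2001:db8::/32.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress


def translate_prefix(address, inside, outside):
    """Swap the inside prefix for the outside prefix and keep the remaining bits."""
    inside_net = ipaddress.ip_network(inside)
    outside_net = ipaddress.ip_network(outside)
    if inside_net.prefixlen != outside_net.prefixlen:
        raise ValueError("prefix lengths must match")
    addr = ipaddress.ip_address(address)
    if addr not in inside_net:
        raise ValueError("address is not inside the internal prefix")
    offset = int(addr) - int(inside_net.network_address)
    return ipaddress.ip_address(int(outside_net.network_address) + offset)


# ULA address of my Alpine VM, translated with two different delegated prefixes.
vm = "fd11:22:33:ff10:2b17:367c:2ddc:2925"
print(translate_prefix(vm, "fd11:22:33:ff00::/56", "2001:db8:0:b100::/56"))
print(translate_prefix(vm, "fd11:22:33:ff00::/56", "2001:db8:7:4200::/56"))
</code></pre></div>
<p>The rule is a fixed one-to-one mapping between two prefixes. As soon as the outside prefix changes, the rule has to change with it, and that is the part I cannot automate on my PPPoE interface.</p>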
<p>After configuring OPNSense Outbound NAT for IPv6 with the source IP range fd11:22:33::/56, my internet is now working – what a struggle!</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">template:~# ping6 google.com
</span></span><span class="line"><span class="cl">PING google.com (2a00:1450:4001:804::200e): 56 data bytes
</span></span><span class="line"><span class="cl">64 bytes from 2a00:1450:4001:804::200e: seq=0 ttl=115 time=7.439 ms
</span></span><span class="line"><span class="cl">64 bytes from 2a00:1450:4001:804::200e: seq=1 ttl=115 time=7.570 ms
</span></span><span class="line"><span class="cl">64 bytes from 2a00:1450:4001:804::200e: seq=2 ttl=115 time=7.546 ms
</span></span></code></pre></div><h2 id="vcf9-and-ipv6">VCF9 and IPv6</h2>
<p>Does VCF9 support IPv6? The answer is a clear yes and no. The situation is complicated: individual components support dual stack or even native IPv6, but important core components either do not support IPv6 at all or the automatic provisioning does not support IPv6.
In the official documentation for VCF9, there are exactly 143 search hits for IPv6, which pretty much sums up the situation. VKS Supervisor, for example, is one of the components that does not support IPv6, and this is what determines the success or failure of my automation of modern workloads in VCF9. My recommendation would therefore be to continue to rely on IPv4 and, at most, run IPv6 in dual stack for VM workloads.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Was it stressful? That&rsquo;s a difficult question. On the one hand, I rediscovered a lot of forgotten knowledge about IPv6, and on the other hand, playing around with routing was really fun. It&rsquo;s a disappointment that VCF doesn&rsquo;t yet fully support IPv6. The same applies to VPCs. Since VPCs are meant to become the standard, it&rsquo;s a bit strange that IPv6 isn&rsquo;t supported there. You can assign IPv6 in the VPC connection profile, but it doesn&rsquo;t work in the VPC.</p>
<p>It&rsquo;s also a pity that I can only test this to a very limited extent. I definitely want to do more tests, including with DHCPv6, and see if I can somehow use my dynamic prefix and assign normal GUA addresses. I also want to try a few things with AVI. As a backup plan, I also have a public VPS with fixed IPv6 addresses, which I currently use as a WireGuard rendezvous server and as an internet breakout in case Telekom decides to put Cloudflare services at a disadvantage again, but that&rsquo;s a whole other story.</p>
<p>It should be noted that even after 25 years, IPv6 is still not the default, and IPv4 remains very important in data centers. Furthermore, it is somewhat disappointing that something as simple as DNS configuration via RA (RDNSS) is not yet supported by every operating system and every end device. I hope you enjoyed the article and feel encouraged to play around with IPv6 yourself.</p>
]]></content>
		</item>
		
		<item>
			<title>Nutanix - Get started with the CE Edition</title>
			<link>https://sdn-warrior.org/posts/nutanix-install-ce/</link>
			<pubDate>Sun, 08 Feb 2026 19:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/nutanix-install-ce/</guid>
			<description><![CDATA[Short blog about my experiences with Nutanix CE and which workarounds I needed.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>In November 2025, I was a guest at Nutanix Next in Darmstadt, where I met up with old colleagues and wanted to take a look at what Nutanix had to offer. Visitors were kindly given the opportunity to obtain certification on the spot, which I naturally took advantage of.</p>
<figure><picture><source srcset="/nutanix-ce/01_hu_f9ca88236445953f.jpg" type="image/jpeg">
          <img
            src="/nutanix-ce/01_hu_f9ca88236445953f.jpg"alt="Yes, this is my happy face"width="1988"
            height="1491"/>
        </picture><figcaption><p>Yes, this is my happy face</p></figcaption></figure>
<p>Having worked with VMware for over 20 years, I was naturally skeptical about what Nutanix could do, and what can I say, the differences are perhaps smaller than I thought. You could also say same, same but different, at least when it comes to the core of virtualization. The Nutanix hypervisor AHV is based on KVM, which is rock solid and which I have been running in my home lab for ages in the form of Unraid. When it comes to storage, Nutanix takes a different approach and primarily relies on HCI and its own storage solution.</p>
<p>In any case, NEXT was decisive enough for me to take another closer look at Nutanix, because after all, there is a Community Edition that should cover most of the features. I would have liked to test the “real” Nutanix version, but unfortunately that&rsquo;s not easily possible. However, I am in talks with one or two Nutanix employees and perhaps a way can be found. That&rsquo;s why this isn&rsquo;t a feature comparison, because it would be like comparing apples and oranges if I were to compare the Community Edition of Nutanix with VCF9.</p>
<p>Nutanix at least has a Community Edition, whereas VMware currently only offers the free ESXi 8. Personally, I would welcome a Community Edition from VMware. I know you can get licenses for VCF through VMUG Advantage and by passing an exam, but it&rsquo;s not the same. Phew, that was quite a long introduction, but let&rsquo;s get started.</p>
<h2 id="getting-started">Getting started</h2>
<p>At first, it wasn&rsquo;t that easy to find the CE Edition. You need a free Nutanix account, and then you can download the CE <a href="https://next.nutanix.com/discussion-forum-14/download-community-edition-38417">here</a>.</p>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Download Link</b>
        </div>
        <div class="admonition-content">If you are not logged in to Nutanix, you will receive a 403 error and will need to click the login button on the 403 page.</div>
    </aside>
<p>Of course, the CE version has a few limitations, and of course I had to implement a few workarounds with my hardware to ensure that everything runs smoothly. But we are already used to that here, and it should come as no surprise to readers.</p>
<p>One limitation, for example, is the cluster size: Nutanix CE can only be deployed as a one-, three-, or four-node cluster. Larger clusters and two-node clusters are reserved for the full version. The same applies to stretched clusters. In addition, the CE version only comes with AHV, which I personally don&rsquo;t find problematic; if I want to use ESX, I deploy VCF or VVF. I first tried Nutanix as a nested deployment, which worked fine, but the storage was really slow. That&rsquo;s not Nutanix&rsquo;s fault, though; it&rsquo;s the same with a nested vSAN cluster.</p>
<p>If you want to install the CE Edition on bare metal, you unfortunately have to use a specific version of Rufus, otherwise Nutanix will not boot from the USB stick. I don&rsquo;t know exactly why this is the case. I tried Mac alternatives because I no longer have a Windows PC, but they didn&rsquo;t work. In the end, I wrote the image in a Windows 11 VM with USB passthrough and Rufus 3.21. Rufus 3.22 and newer do not work.</p>
<p>Another limitation of the CE Edition is that you must be online and have a Nutanix account. This must be entered after creating the cluster. Nutanix CE cannot be used without free online activation.</p>
<h2 id="hardware">Hardware</h2>
<p>To install Nutanix, you need at least three drives: one for booting, one for the CVM (the hot tier must be flash), and one for cold data. I used NVMe storage for all three. My hosts are three MS-01s with 13th-generation Intel i9 CPUs, 96 GB RAM, and 1x 1 TB plus 2x 2 TB of storage each. Before the question arises: unfortunately, I cannot use all 20 threads of the i9. Nutanix behaves like ESX in this respect; only the P and E cores without Hyper-Threading are usable, which makes a total of 14 vCPUs per host. I use 2x 10 Gb/s for the network, but less is also possible. However, you should not expect miracles in terms of performance if the network connection is slower than 10 Gb/s.</p>
<figure><a href="06.jpg"><picture><source srcset="/nutanix-ce/06_hu_f89b2cb82edbd9bd.jpg" type="image/jpeg">
          <img
            src="/nutanix-ce/06_hu_f89b2cb82edbd9bd.jpg"alt="MS-01"width="3344"
            height="2508"/>
        </picture></a><figcaption><p>MS-01</p></figcaption></figure>
<h2 id="network">Network</h2>
<p>While we&rsquo;re on the subject of networks, Nutanix uses the 192.168.5.0/24 network internally for communication between the hypervisor and the CVM. I haven&rsquo;t found a way to change this (yet). If you have important external services such as DNS or NTP in this range, things get difficult. Furthermore, the CVM and the hypervisor host must be in the same L2 network, and unfortunately, the CE Edition does not support VLANs for this. I managed to put my AHV host and my CVM into a management VLAN, but the VLAN of the CVM is not persistent: after each reboot, the VLAN tag was gone again. The problem seems to be that the VLAN tag settings are not written to cvm_config.json. I tried to add them manually, but I don&rsquo;t seem to be using the correct syntax. I need to test this further, and if I figure out how to do it, I will update this article. Another thing that may seem unusual to VMware users is that Nutanix initially uses all network adapters after installation, and this cannot be configured during installation.</p>
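<p>Before installing, it can be worth checking whether any of your DNS or NTP servers sit inside the internal 192.168.5.0/24 range. The following is just a small sketch; the function name and the example addresses are made up for illustration.</p>

```shell
# Hypothetical helper: succeeds if the IPv4 address lies in 192.168.5.0/24,
# the range Nutanix uses between the hypervisor and the CVM.
in_cvm_internal_net() {
    case "$1" in
        192.168.5.*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example addresses only; replace them with your own DNS and NTP servers.
for ip in 192.168.5.53 192.168.50.1 10.0.0.1; do
    if in_cvm_internal_net "$ip"; then
        echo "$ip collides with the internal CVM network"
    fi
done
```

<p>Because the internal range is a /24, a simple match on the first three octets is enough here; other prefix lengths would need real subnet arithmetic.</p>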
<h2 id="installation">Installation</h2>
<p>The installation is so simple and straightforward that I briefly considered not writing anything at all. But that&rsquo;s mainly because there&rsquo;s virtually nothing to configure.</p>
<figure><a href="02.jpg"><picture><source srcset="/nutanix-ce/02_hu_f1f4816dd63a9f3d.jpg" type="image/jpeg">
          <img
            src="/nutanix-ce/02_hu_f1f4816dd63a9f3d.jpg"alt="Installation"width="1707"
            height="1166"/>
        </picture></a><figcaption><p>Installation</p></figcaption></figure>
<p>What you see in this screenshot is exactly what you can configure in the CE version. I use my smallest disk as the boot disk (H) and both 2TB NVMes as data or CVM disks. The installer usually selects the appropriate option. If you have different disk sizes (all flash), I would select the largest one as the data disk. Ideally, the cluster should be uniformly equipped, but this is not a must.</p>
<h2 id="cvm---what-is-that-anyway">CVM - What is that anyway?</h2>
<p>I should perhaps add a brief comment on this. One of my biggest criticisms was the complicated structure—why is there such a thing as a CVM, and why can&rsquo;t it be like vSAN? I had a very good conversation about this topic with <a href="https://www.linkedin.com/in/basraayman/">Bas Raayman</a> at Next and if anyone can explain it, it&rsquo;s him.</p>
<p>I hope I can summarize everything briefly. On the one hand, the reason is historical. Nutanix started with ESXi as its hypervisor, and it was simply not allowed to bring its functionality into the kernel via its own extensions. On the other hand, the CVM also makes the platform hypervisor agnostic. Among other things, the CVM serves as a storage controller, provides IO paths, contains the cluster logic, can provide a file server, and certainly does much more. The big advantage is flexibility: Nutanix runs on AHV, VMware, and in the cloud, and the CVM code base is always the same. Furthermore, VM metadata is stored in the CVM, which means I can easily migrate VMs between hypervisors.</p>
<p>But where there is light, there is also shadow. Of course, this comes at the cost of some latency. According to Bas, it&rsquo;s hardly noticeable, but if I need to squeeze out every last bit of latency, then ESX is probably faster. The second problem, which I consider much more significant, is that users can log in to the CVM at any time and, in the case of the CE Edition, must do so. With a lot of root privileges comes a lot of responsibility. If you shut down the CVM or break it, the storage node will also fail.</p>
<p>Overall, however, the advantages outweigh the disadvantages if you want to be hypervisor agnostic. I must admit that at first I didn&rsquo;t understand the necessity of the CVM and found the whole thing unnecessarily complicated. My opinion has changed in the meantime, and I can now appreciate the charm of this solution.</p>
<p>Regardless, I will continue to favor NFS/iSCSI as the primary storage in my home lab wherever possible. Maybe I&rsquo;m just a little stubborn and old-fashioned in that regard.</p>
<h2 id="creating-the-cluster">Creating the cluster</h2>
<p>After installing all three hosts, I was faced with the problem that my servers were not reachable. Since it is not officially possible to configure a VLAN for management, I had configured my switch ports as untagged. That was not the cause, though: the first adapter on the MS-01 is also used for the integrated IPMI. By default, Nutanix builds an active/standby bond with all adapters and prefers eth0 as the active adapter, even if it only runs at 1 Gb/s. Since my IPMI does not support VLAN tags either and sits in a different LAN than my Nutanix cluster, I had a problem.</p>
<p>The solution is obvious: reconfigure the Open vSwitch. But how?</p>
<h3 id="changing-the-ovs-switch-uplinks">Changing the OVS Switch uplinks</h3>
<p>Now this is where the CVM comes into play, but how can you access the CVM if you can&rsquo;t reach it via the network? Thanks to IPMI, I have console access to my server, and I mentioned the 192.168.5.0/24 network earlier. Each CVM has the same internal IP, so you can jump to the CVM via SSH to 192.168.5.2. Each AHV host has 192.168.5.1. Each CVM has two interfaces. The external interface is assigned to the OVS on the default bridge br0, and the internal interface runs via a Linux bridge virbr0. To log in to the AHV host, you need the <em><strong>root</strong></em> user, and the CVM must be addressed with the user <em><strong>nutanix</strong></em>. Both users have the default password <em><strong>nutanix/4u</strong></em>.</p>
<p>After logging in, you can change the uplinks of the default br0 using the following command. In my case, I only want to use the two 10 Gb/s interfaces of my MS-01.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">manage_ovs --bridge_name br0 --interfaces eth2,eth3 --bond_mode active-backup update_uplinks
</span></span></code></pre></div>
    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Modifying the network</b>
        </div>
        <div class="admonition-content">If the host is assigned to an existing cluster, the host should be in maintenance mode and the CVM should be shut down. Otherwise, significant problems may occur. Since no cluster has been initialized yet, this step is not necessary.</div>
    </aside>
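<p>Since a typo in the interface list can cut the host off again, it may help to assemble the command first and only run it after a second look. The sketch below does not call manage_ovs itself; <code>build_uplink_cmd</code> is a made-up helper that just prints the command from above. As far as I know, manage_ovs also offers <code>show_interfaces</code> and <code>show_uplinks</code> to inspect the current state beforehand.</p>

```shell
# Hypothetical helper: print (but do not run) the manage_ovs call.
# $1 = bridge, $2 = comma-separated interfaces, $3 = bond mode (optional).
build_uplink_cmd() {
    bridge="$1"
    ifaces="$2"
    mode="${3:-active-backup}"
    case "$ifaces" in
        ""|*" "*)
            echo "interfaces must be a comma-separated list without spaces" >&2
            return 1
            ;;
    esac
    printf 'manage_ovs --bridge_name %s --interfaces %s --bond_mode %s update_uplinks\n' \
        "$bridge" "$ifaces" "$mode"
}

# Inspect first (on the CVM):
#   manage_ovs show_interfaces
#   manage_ovs --bridge_name br0 show_uplinks
build_uplink_cmd br0 eth2,eth3
```

<p>The printed line is identical to the command I used, so it can be copied into the CVM shell once it looks right.</p>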
<p>After this change, the host and CVM are reachable from my admin desktop. The same must now be done on the other two hosts.</p>
<h3 id="workaround-due-to-identical-uuids-of-the-hosts">Workaround due to identical UUIDs of the hosts</h3>
<p>This brings me to the first real problem with the installation, and I noticed it very late in the process. Unfortunately, all my MS-01 servers report the same UUID. By default, libvirt takes the host UUID from SMBIOS, and that value is identical on all of my MS-01s. This should not happen and is clearly due to my hardware. I cannot say at this point whether this also happens with an MS-A2, but at least all six of my MS-01s are affected.</p>
<p>The annoying thing is that this only becomes apparent when a VM needs to be migrated. I can create a cluster without any problems, run lifecycle updates, and even the Prism Central deployment works fine, but as soon as a VM needs to be migrated, the problems start.</p>
<p>I received the following error message.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">Operation failed: internal error: Attempt to migrate guest to the same host 03000200-0400-0500-0006-000700080009: 61
</span></span></code></pre></div><p>And if you know a little bit about this, you&rsquo;ll notice that the reason is already stated in the error message. All hosts have the UUID 03000200-0400-0500-0006-000700080009. As a result, Nutanix assumes that the VM should be migrated to the same host where it is running and therefore aborts the process.</p>
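<p>If you would rather find out before the first migration fails, you can collect the SMBIOS UUID of every host and look for repeats. This is only a sketch: <code>find_duplicate_uuids</code> is a made-up helper, and the host names in the example are placeholders.</p>

```shell
# Hypothetical helper: reads lines of the form "hostname uuid" on stdin
# and prints every UUID that occurs more than once.
find_duplicate_uuids() {
    awk '{ count[tolower($2)]++ }
         END { for (u in count) if (count[u] > 1) print u }'
}

# On each AHV host, one line can be produced with:
#   echo "$(hostname) $(cat /sys/class/dmi/id/product_uuid)"
# Example input with placeholder host names:
printf '%s\n' \
    'ahv1 03000200-0400-0500-0006-000700080009' \
    'ahv2 03000200-0400-0500-0006-000700080009' \
    'ahv3 03000200-0400-0500-0006-000700080009' | find_duplicate_uuids
```

<p>Any output means at least two hosts share a UUID and the workaround is needed; no output means the hosts are fine in this respect.</p>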
<p>If you are logged in as root on the AHV, you can use the following command to display the UUID.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">cat /sys/class/dmi/id/product_uuid
</span></span></code></pre></div><p>If this is the same on all hosts, then congratulations, the following workaround is necessary.
First, we generate fresh, unique UUIDs, one per host. This can be done with the following command on the AHV. You can generate all three on the same host in one go.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">uuidgen -r
</span></span></code></pre></div><p>It&rsquo;s best to save them, because unfortunately the fix isn&rsquo;t 100% permanent, but more on that later.
Next, the following file must be modified.
<em><strong>/etc/libvirt/libvirtd.conf</strong></em>
This can be done with nano or vi. Since I am more comfortable with nano, it is always my tool of choice, if available.
The fix is quite simple: relatively far down in the file, set host_uuid to one of the freshly generated UUIDs and comment out the host_uuid_source line.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl"># NB This default all-zeros UUID will not work. Replace
</span></span><span class="line"><span class="cl"># it with the output of the &#39;uuidgen&#39; command and then
</span></span><span class="line"><span class="cl"># uncomment this entry
</span></span><span class="line"><span class="cl">host_uuid = &#34;c39808fa-8255-4c46-84ee-84e3ad5d16f0&#34;
</span></span><span class="line"><span class="cl">#host_uuid_source = &#34;smbios&#34;
</span></span></code></pre></div><p>Finally, restart the libvirt service and we are ready to go.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">systemctl restart libvirtd
</span></span></code></pre></div>
    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Persistence of the workaround</b>
        </div>
        <div class="admonition-content">Unfortunately, the workaround is not 100% persistent. Rebooting and completely shutting down the cluster or environment is not a problem, but when upgrading to a new AHV version, the file is overwritten with defaults again. A 100% permanent solution would probably only be a BIOS update where the manufacturer fixes the problem with the UUID.</div>
    </aside>
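<p>Because the edit has to be repeated after every AHV upgrade, it can be scripted. The following is a minimal sketch, not an official Nutanix procedure: <code>patch_libvirt_uuid</code> is a made-up helper, and it assumes the file still contains a host_uuid line (commented or not) as shown above. If that line is missing, nothing is added.</p>

```shell
# Hypothetical helper: re-apply the host_uuid workaround to a libvirtd.conf.
# $1 = path to the config file, $2 = UUID saved earlier for this host.
patch_libvirt_uuid() {
    conf="$1"
    uuid="$2"
    tmp="$(mktemp)" || return 1
    # 1st expression: set host_uuid (uncommenting it if necessary).
    # 2nd expression: comment out an active host_uuid_source line.
    if sed -E \
        -e "s|^#?[[:space:]]*host_uuid[[:space:]]*=.*|host_uuid = \"$uuid\"|" \
        -e 's|^[[:space:]]*(host_uuid_source[[:space:]]*=.*)|#\1|' \
        "$conf" > "$tmp"
    then
        # Overwrite in place so owner and permissions of the file are kept.
        cat "$tmp" > "$conf"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Example (on the AHV host, as root; keep a backup first):
#   cp /etc/libvirt/libvirtd.conf /root/libvirtd.conf.bak
#   patch_libvirt_uuid /etc/libvirt/libvirtd.conf "c39808fa-8255-4c46-84ee-84e3ad5d16f0"
#   systemctl restart libvirtd
```

<p>Each host needs its own UUID, so this is where the saved values from earlier come in handy.</p>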
<h3 id="creating-the-cluster---now-really">Creating the cluster - now really!</h3>
<p>Once network connectivity has been established and our UUIDs are once again unique, the cluster can be created. Since there is no GUI available for us to use at this point, this must also be done manually via the CLI. Don&rsquo;t worry, it&rsquo;s quick and painless.
To do this, log in to any CVM as the nutanix user. This should now also be possible via the IP configured in the installer.</p>
<p>The cluster creation process is simple and can be done quickly using the following CLI command.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">cluster -s cvm1_ip_addr,cvm2_ip_addr,... create
</span></span></code></pre></div><p>After a certain amount of time, the cluster status can be checked with this command.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">cluster status
</span></span></code></pre></div>
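<p>The CVM addresses have to be passed as one comma-separated list without spaces. As a small sketch, the made-up helper <code>build_cluster_create_cmd</code> below joins the addresses and prints the command instead of running it. The IPs are placeholders, and single-node clusters are deliberately left out of this sketch.</p>

```shell
# Hypothetical helper: print the cluster create command for 3 or 4 CVM IPs.
build_cluster_create_cmd() {
    case "$#" in
        3|4) ;;
        *) echo "expected 3 or 4 CVM IPs, got $#" >&2; return 1 ;;
    esac
    list="$1"
    shift
    for ip in "$@"; do
        list="$list,$ip"
    done
    printf 'cluster -s %s create\n' "$list"
}

# Placeholder addresses; use the CVM IPs you set in the installer.
build_cluster_create_cmd 10.0.0.11 10.0.0.12 10.0.0.13
```

<p>The node count check mirrors the CE limitation mentioned earlier, so a two-node attempt is rejected before anything is executed.</p>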
    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>DNS, NTP, Cluster VIP</b>
        </div>
        <div class="admonition-content">The CE edition comes with public DNS and NTP server entries. If you don&rsquo;t want this, you can change it via the cluster&rsquo;s GUI or directly via CLI.
Without further configuration, the cluster starts without a cluster VIP, which must then be set afterwards.</div>
    </aside>
<p>Additional settings via the CLI.</p>
<p>Define the cluster name.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ncli cluster edit-params new-name=cluster_name
</span></span></code></pre></div><p>Configure a name server for the cluster.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ncli cluster add-to-name-servers servers=public_name_server_ip_address
</span></span></code></pre></div><p>Configure an external IP address for the cluster.</p>
<p>(This parameter is required for a CE cluster.)</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ncli cluster set-external-ip-address external-ip-address=cluster_ip_address
</span></span></code></pre></div><p>Add an NTP server IP address to the list of NTP servers.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ncli cluster add-to-ntp-servers servers=NTP_server_ip_address
</span></span></code></pre></div><p>After completing all these steps, you should now be able to log in to the Nutanix cluster via the ClusterIP.</p>
<p>With Prism Element, Nutanix provides a web interface that can be reached via any host and via the cluster IP on port 9440. This means that deploying Prism Central is not strictly required. If you intend to use Nutanix as a pure hypervisor without Nutanix Flow and other services, Prism Element is entirely sufficient. If you want to manage multiple clusters or use Nutanix Flow (which I am very interested in), then Prism Central is a must. It can be easily deployed as a VM in the cluster via Prism Element. I will describe this further down in the article.</p>
<figure><a href="03.jpg"><picture><source srcset="/nutanix-ce/03_hu_c88bac9ea269e849.jpg" type="image/jpeg">
          <img
            src="/nutanix-ce/03_hu_c88bac9ea269e849.jpg"alt="Prism Element"width="1718"
            height="915"/>
        </picture></a><figcaption><p>Prism Element</p></figcaption></figure>
<p>When logging in to the Prism Element GUI for the first time with the user <em><strong>admin</strong></em> and the password <em><strong>nutanix/4u</strong></em>, the password must be changed.
And while we&rsquo;re on the subject of changing passwords, this should also be done for all other accounts.
Nutanix has a KB article that describes how to do this via the CVM. I strongly recommend changing the passwords of all default accounts.</p>
<h3 id="default-paasword-changes">Default Password changes</h3>
<p>The following three commands change the password for the root, admin, and nutanix accounts at the AHV host level on all hosts in the cluster. Do not modify the commands. Each one asks for the new password twice and does not display it. They can be run from any CVM in the cluster.</p>
<p>Change CVM local account</p>
<p>nutanix</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">sudo passwd nutanix
</span></span></code></pre></div><p>Change AHV local accounts</p>
<p>root</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">echo -e &#34;CHANGING ALL AHV HOST ROOT PASSWORDS.\nPlease input new password: &#34;; read -rs password1; echo &#34;Confirm new password: &#34;; read -rs password2; if [ &#34;$password1&#34; == &#34;$password2&#34; ]; then for host in $(hostips); do echo Host $host; echo $password1 | ssh root@$host &#34;passwd --stdin root&#34;; done; else echo &#34;The passwords do not match&#34;; fi
</span></span></code></pre></div><p>admin</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">echo -e &#34;CHANGING ALL AHV HOST ADMIN PASSWORDS.\nPlease input new password: &#34;; read -rs password1; echo &#34;Confirm new password: &#34;; read -rs password2; if [ &#34;$password1&#34; == &#34;$password2&#34; ]; then for host in $(hostips); do echo Host $host; echo $password1 | ssh root@$host &#34;passwd --stdin admin&#34;; done; else echo &#34;The passwords do not match&#34;; fi
</span></span></code></pre></div><p>nutanix</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">echo -e &#34;CHANGING ALL AHV HOST NUTANIX PASSWORDS.\nPlease input new password: &#34;; read -rs password1; echo &#34;Confirm new password: &#34;; read -rs password2; if [ &#34;$password1&#34; == &#34;$password2&#34; ]; then for host in $(hostips); do echo Host $host; echo $password1 | ssh root@$host &#34;passwd --stdin nutanix&#34;; done; else echo &#34;The passwords do not match&#34;; fi
</span></span></code></pre></div><h3 id="additional-settings">Additional settings</h3>
<p>I switched my default network switch vs0 to jumbo frames. This can be done via the dropdown menu (in Prism Element) under <em><strong>Settings -&gt; Network Configuration -&gt; Virtual Switch -&gt; vs0</strong></em> and the Edit icon. Here, the bond and uplinks can also be adjusted.</p>
<figure><a href="05.jpg"><picture><source srcset="/nutanix-ce/05_hu_9fc1ef9b59b7cbff.jpg" type="image/jpeg">
          <img
            src="/nutanix-ce/05_hu_9fc1ef9b59b7cbff.jpg"alt="Network"width="1717"
            height="821"/>
        </picture></a><figcaption><p>vs0 network settings</p></figcaption></figure>
<p>Nutanix offers two options for applying the change: Standard or Quick.
In Standard mode, the hosts are put into maintenance mode beforehand; in Quick mode, the changes are implemented immediately. If something is wrong with the physical configuration, the workload would be offline as a result.
Nutanix recommends Standard mode here, as adjusting the network also affects storage traffic, which always runs via the vs0 switch. If you have more than two adapters, you can also separate the traffic via an additional switch (alternatively, you can configure a bond without failover and separate traffic with two network types).</p>
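<p>After switching to jumbo frames, it is worth checking that large packets really make it through the physical switches unfragmented. The sketch below only does the arithmetic for the ping payload size. It assumes an MTU of 9000, which is my assumption for illustration, and the function name is made up.</p>

```shell
# Hypothetical helper: largest ICMP echo payload that fits into one packet.
# $1 = MTU, $2 = IP version (4 or 6, default 4).
icmp_payload_for_mtu() {
    mtu="$1"
    family="${2:-4}"
    case "$family" in
        4) echo $((mtu - 20 - 8)) ;;   # 20 bytes IPv4 header + 8 bytes ICMP header
        6) echo $((mtu - 40 - 8)) ;;   # 40 bytes IPv6 header + 8 bytes ICMPv6 header
        *) return 1 ;;
    esac
}

icmp_payload_for_mtu 9000      # IPv4
icmp_payload_for_mtu 9000 6    # IPv6

# With Linux iputils, a test without fragmentation could then look like this
# (OTHER_HOST_IP is a placeholder):
#   ping -M do -s "$(icmp_payload_for_mtu 9000)" OTHER_HOST_IP
```

<p>If the ping with the full payload fails while a normal ping works, some device on the path is still running with a smaller MTU.</p>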
<p>In addition, under <em><strong>Settings -&gt; SSL Certificate</strong></em>, I replaced the Prism Element certificate with one issued by my own root CA. All IPs and FQDNs of all cluster nodes and the cluster VIP must be entered as SANs. I also switched the UI to dark mode and English. This can be done under <em><strong>Settings -&gt; UI Settings</strong></em> and Language Settings.</p>
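<p>Typing the SAN list by hand gets tedious with several nodes, so here is a small sketch that assembles it. The helper <code>build_san</code> and all names and addresses are made up for illustration, and it only distinguishes IPv4 addresses from host names.</p>

```shell
# Hypothetical helper: build an OpenSSL subjectAltName value from a list
# of FQDNs and IPv4 addresses.
build_san() {
    san=""
    for entry in "$@"; do
        case "$entry" in
            *[!0-9.]*) kind="DNS" ;;   # contains more than digits and dots: a name
            *)         kind="IP"  ;;   # digits and dots only: treated as IPv4
        esac
        san="${san:+$san,}$kind:$entry"
    done
    printf 'subjectAltName=%s\n' "$san"
}

# Placeholder values for the cluster VIP and one node:
build_san prism.lab.example 10.0.0.10 cvm1.lab.example 10.0.0.11

# With OpenSSL 1.1.1 or newer, the result could be passed to a CSR like this:
#   openssl req -new -key prism.key -subj "/CN=prism.lab.example" \
#       -addext "$(build_san prism.lab.example 10.0.0.10)" -out prism.csr
```

<p>How the CSR is then signed depends entirely on your own CA, so that part is left out here.</p>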
<h2 id="lifecycle">Lifecycle</h2>
<p>The lifecycle can be managed via Prism Element or Prism Central. Prism Central has its own lifecycle independent of Prism Element. Nutanix CE comes in AOS version 6.9.1. Currently, the maximum upgrade available via Lifecycle is to AOS version 7.3.1.2.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Current Bug with 7.5.x.x</b>
        </div>
        <div class="admonition-content">There is currently a problem with hosts on hardware that is not officially supported and Nutanix AOS versions higher than 7.3.x.x. In addition, if your host is affected by the UUID problem, the workaround must be reapplied after every AHV upgrade. I therefore first upgraded to 7.3.x via Lifecycle and then deployed Prism Central. Before the upgrade, I shut down all VMs (except for the CVMs) so that I would not run into any errors.</div>
    </aside>
<h2 id="deploying-nutanix-prism">Deploying Nutanix Prism Central</h2>
<p>To deploy Prism Central, I first created a new storage container. A storage container is a logical construct to which I can apply policies and where I can create my vDisks for VMs. It is not a volume or LUN. Each storage container can have different policies and therefore behave differently, but all storage containers run on the same storage pool (in my lab, I only have one storage pool consisting of the resources of the three servers).</p>
<p>To create the storage container, I switch to Storage via the dropdown menu and create a new container called Workload. Compression and a fault tolerance of 1N/1D are set by default, which means that one node or one disk in the cluster can fail.
I could also set the advertised capacity or a reserved capacity. I leave everything at default here. Unfortunately, erasure coding does not work because I would need at least 4 hosts for that. That&rsquo;s a pity, because vSAN ESA is a little more flexible here, allowing me to use RAID 5 erasure coding with as few as 3 hosts.</p>
<figure><a href="04.jpg"><picture><source srcset="/nutanix-ce/04_hu_580283e627807a4.jpg" type="image/jpeg">
          <img
            src="/nutanix-ce/04_hu_580283e627807a4.jpg"alt="Storage Container"width="1715"
            height="1221"/>
        </picture></a><figcaption><p>Storage Container</p></figcaption></figure>
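<p>To get a feeling for what the missing erasure coding costs, here is a rough back-of-the-envelope calculation. It rests on several assumptions of mine: both 2 TB drives of each host count towards the pool, a fault tolerance of 1N/1D means that two full copies of the data are kept, RAID 5 on three hosts uses a 2+1 layout, and CVM, metadata, and compression effects are ignored. The function name is made up.</p>

```shell
# Hypothetical helper: usable = raw * data_parts / total_parts (whole TB).
usable_tb() {
    echo $(( $1 * $2 / $3 ))
}

raw=$(( 3 * 2 * 2 ))   # 3 hosts x 2 drives x 2 TB = 12 TB raw (assumed layout)

echo "two full copies:    $(usable_tb "$raw" 1 2) TB usable"
echo "2+1 erasure coding: $(usable_tb "$raw" 2 3) TB usable"
```

<p>So under these assumptions, the replica-based container leaves roughly half of the raw capacity, while a 2+1 layout would leave about two thirds.</p>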
<p>To deploy Prism Central, simply click on the Nutanix Prism Element Dashboard (top left) and select “Deploy New Prism Central.” The dialog box will display all compatible versions, which you can then download. I opted for the latest version. I deployed Prism Central in the small form factor because I want to use Nutanix Flow later, and that doesn&rsquo;t work with extra-small.
Using Create Network, I created a new VLAN network on the default vSwitch vs0. This VLAN must be routed and must be able to communicate with the CVMs and the AHV server. It must also be present on the physical switches. The other settings are very self-explanatory. IP, gateway IP, DNS, NTP, and storage container should be known and available. I decided not to use an HA deployment due to the resources required. Finally, press Deploy and then it&rsquo;s time for 1-2 coffees, because the installation can take a good 30-40 minutes.</p>
<p>Once Prism Central has been successfully deployed, the cluster still needs to be registered. This is done via Prism Element in the same place where the deployment was started. I will describe the rest of the Prism Central setup and the post-setup steps in a separate article—sometime in the future.</p>
<h2 id="safely-shut-down-the-environment-and-restart-it">Safely shut down the environment and restart it.</h2>
<p>Since I do not have my lab environments running continuously, I shut down my environments regularly, and as with VCF, this must be done in a fixed order.</p>
<ul>
<li>Shut down all workload VMs via Prism Central or Prism Element.</li>
<li>Log in to Prism Central via SSH with the nutanix user and enter <em><strong>cluster stop</strong></em>.</li>
<li>Wait until the message “Success” appears. You can check the status using <em><strong>cluster status</strong></em>. Even if Prism Central is not a clustered installation, this is the safest option.</li>
<li>Shut down Prism Central with <em><strong>sudo shutdown now</strong></em>.</li>
<li>Log in to Prism Element and wait until Prism Central VM has shut down.</li>
<li>Log in via SSH as the nutanix user on any CVM and execute <em><strong>cluster stop</strong></em>.</li>
<li>Wait until the message “Success” appears. You can check the status using <em><strong>cluster status</strong></em>.</li>
<li>Log in to all CVMs via SSH as the nutanix user and shut them down with <em><strong>sudo shutdown now</strong></em>.</li>
<li>Log in to all AHV servers via SSH as root.</li>
<li>Check with <em><strong>virsh list</strong></em> whether any VMs are still running.</li>
<li>Once all VMs have been stopped, shut down each AHV host with <em><strong>shutdown now</strong></em>.</li>
</ul>
<p>Starting the cluster is done in exactly the opposite order. Instead of <em><strong>cluster stop</strong></em>, <em><strong>cluster start</strong></em> must be entered. Simply booting up the Prism Central VM is not enough; here too, the cluster must be started explicitly with cluster start.</p>
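<p>The order can also be written down as plain data, which makes it harder to skip a step. The following is only a minimal Python sketch: the names SHUTDOWN_STEPS, STARTUP_ACTION, and startup_steps are made up for illustration, the status checks (cluster status, virsh list) are left out, and nothing here connects to a host. It only shows how the startup plan falls out of the shutdown plan by reversing it.</p>

```python
# Hypothetical sketch: the shutdown order as (component, action) pairs.
# Nothing is executed; this is documentation in code form.
SHUTDOWN_STEPS = [
    ("workload VMs", "shut down"),
    ("Prism Central", "cluster stop"),
    ("Prism Central VM", "shut down"),
    ("CVM cluster", "cluster stop"),
    ("all CVMs", "shut down"),
    ("all AHV hosts", "shut down"),
]

# Assumption: whatever was shut down is powered on again on the way back up.
STARTUP_ACTION = {
    "shut down": "power on",
    "cluster stop": "cluster start",
}


def startup_steps(shutdown_steps):
    """Return the startup plan: reversed order, each action swapped for its counterpart."""
    return [(component, STARTUP_ACTION[action]) for component, action in reversed(shutdown_steps)]


if __name__ == "__main__":
    for number, (component, action) in enumerate(startup_steps(SHUTDOWN_STEPS), start=1):
        print(f"{number}. {component}: {action}")
```

<p>Keeping the list in one place means the startup order can never drift away from the shutdown order.</p>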
<h2 id="summary">Summary</h2>
<p>At first glance, a lot of things are different from VMware. But what I particularly like, and haven&rsquo;t mentioned in this article yet, is that as a home lab user, you get a really well-rounded and great software package for free. The CE also includes the Kubernetes Engine and Flow, which are essentially Nutanix&rsquo;s alternatives to VKS and NSX. Overall, I really liked how easy it was to install Flow or Move (the HCX alternative) via a kind of app store.</p>
<p>I have to admit that the installation process wasn&rsquo;t exactly easy. In this article, I&rsquo;ve only written down the results of several days of trying and searching for the problem. Nutanix can&rsquo;t do anything about the UUID problem, but I find it difficult to understand why the AHV installation process doesn&rsquo;t simply support VLANs like ESX does. However, I also know that these are problems with the CE Edition and do not affect Nutanix Fusion.</p>
<p>But it&rsquo;s not like you&rsquo;re faced with unsolvable problems. The Nutanix forum is full of solutions and help; you just have to find them. It also doesn&rsquo;t hurt to be familiar with KVM and Open vSwitch. I have experience with both thanks to my work with Proxmox and Unraid.</p>
<p>The fact that you need three hard drives could also be an obstacle. I wouldn&rsquo;t have been able to get Nutanix to run on a NUC. Maybe it&rsquo;s possible to boot from a USB stick; I&rsquo;ll have to test that. Fortunately, nested deployment is also an option if you can live with the lower performance of the storage.</p>
<p>In the future, I will write more articles about my small test cluster because I definitely want to try Move and Flow.
I hope that anyone who was hoping I would write a scathing review isn&rsquo;t too disappointed, but to be fair, I can only compare free software with free software, and VMware would currently come up short. I&rsquo;m curious to see how this topic develops and what else I will write about it.</p>
]]></content>
		</item>
		
		<item>
			<title>AVI - Connect Keycloak to AVI Loadbalancer</title>
			<link>https://sdn-warrior.org/posts/avi-keycloak/</link>
			<pubDate>Tue, 27 Jan 2026 21:30:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/avi-keycloak/</guid>
			<description><![CDATA[A short blog post on how to connect Keycloak to the AVI load balancer.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>Today, I would like to address the topic of how Keycloak can be used as a central identity management system with the Advanced Load Balancer.
In theory, this is relatively easy to implement, but in practice, there are a few challenges.
I would like to describe exactly what these are and also present a solution.
As a little bonus, I will also show you how to use a FIDO key (YubiKey) to log in.</p>
<p>I must admit, however, that I haven&rsquo;t had such a frustrating setup in a long time, and it involved a lot of trial and error.
Now, OIDC isn&rsquo;t exactly one of my favorite topics, and I only had rudimentary experience with Keycloak, but that shouldn&rsquo;t stop me from trying something new. Right? Right! So let&rsquo;s get started.</p>
<h2 id="setup-keycloak">Setup Keycloak</h2>
<p>At this point, I made it relatively easy for myself by installing the official Keycloak Docker version 26.5.2 on my Unraid.
Keycloak requires a Postgres DB, so I used an existing one—also on Unraid.
If you were hoping that I would write instructions on how to install Keycloak here, I&rsquo;m afraid I have to disappoint you.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Passkeys are only fully supported in version 26.4.0 and above. The feature has been available as a preview since version 23.0.0.</div>
    </aside>
<p>Since I want to use OKTA as a provider in my VCF setup later on, my Keycloak needs a valid certificate that can be publicly validated.
Here, too, I have made it easy for myself and am using my NGINX proxy manager with Let&rsquo;s Encrypt support and a certificate from one of my domains.
So that I don&rsquo;t have to store my private IPs in Cloudflare&rsquo;s DNS, I use SplitDNS at home. This means I can resolve keycloak.evilcorp.info, but you cannot.
Since most of my Docker containers only run internally with HTTP, I have been using the proxy method for a long time and it has proven to be robust and reliable for me.</p>
<h3 id="realm-setup">Realm Setup</h3>
<p>Now that you have somehow deployed Keycloak on your system, we can start with the actual configuration. To do this, you first need to create a new realm.
A realm is a tenant, you could say. A realm has its own users, groups, and so on.
In addition, you can define your own identity provider for each realm. In my case, for the sake of simplicity, I use Microsoft AD (because it&rsquo;s there anyway), but I could also use LinkedIn, Facebook, GitHub, Google, and many other identity providers. A combination of several is also possible.
The only important thing is that the username must be unique within the realm. No two identical usernames can exist.</p>
<p>A new realm can be easily created under Manage realms and only requires a name. I named my realm lab-ad.</p>
<h3 id="ldap-connection">LDAP Connection</h3>
<p>I deliberately avoided using LDAPS in my lab because my container does not have persistent storage, which means I cannot easily link the root CA of my domain.
Fixing this is on my long to-do list; in your own setup, you should build it properly with LDAPS.
An AD is integrated under User federation and not under Identity providers. It sounds strange, but that&rsquo;s how it is.</p>
<p>Here are my settings at a glance:</p>
<p>General options</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">UI display name ldap
</span></span><span class="line"><span class="cl">Vendor Active Directory
</span></span></code></pre></div><p>Connection and authentication settings</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">Connection URL ldap://home.lab
</span></span><span class="line"><span class="cl">Enable StartTLS Off
</span></span><span class="line"><span class="cl">Use Truststore SPI Never
</span></span><span class="line"><span class="cl">Connection pooling On
</span></span><span class="line"><span class="cl">Connection timeout 
</span></span><span class="line"><span class="cl">Bind type simple
</span></span><span class="line"><span class="cl">Bind DN CN=keycloak svc,OU=ServiceAccounts,DC=home,DC=lab
</span></span><span class="line"><span class="cl">Bind credentials ••••••••••
</span></span></code></pre></div><p>LDAP searching and updating</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">Edit mode READ_ONLY
</span></span><span class="line"><span class="cl">Users DN OU=User,DC=home,DC=lab
</span></span><span class="line"><span class="cl">Relative user creation DN 
</span></span><span class="line"><span class="cl">Username LDAP attribute sAMAccountName
</span></span><span class="line"><span class="cl">RDN LDAP attribute cn
</span></span><span class="line"><span class="cl">UUID LDAP attribute objectGUID
</span></span><span class="line"><span class="cl">User object classes user
</span></span><span class="line"><span class="cl">User LDAP filter (&amp;(objectClass=user)(!(objectClass=computer)))
</span></span><span class="line"><span class="cl">Search scope One Level
</span></span><span class="line"><span class="cl">Read timeout 
</span></span><span class="line"><span class="cl">Pagination On
</span></span><span class="line"><span class="cl">Referral 
</span></span></code></pre></div><p>Synchronization settings</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">Import users On
</span></span><span class="line"><span class="cl">Sync Registrations On
</span></span><span class="line"><span class="cl">Batch size 
</span></span><span class="line"><span class="cl">Periodic full sync Off
</span></span><span class="line"><span class="cl">Periodic changed users sync On
</span></span><span class="line"><span class="cl">Changed users sync period 60
</span></span><span class="line"><span class="cl">Remove invalid users during searches On
</span></span></code></pre></div><p>Kerberos integration</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">Allow Kerberos authentication Off
</span></span><span class="line"><span class="cl">Use Kerberos for password authentication Off
</span></span></code></pre></div><p>Cache settings</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">Cache policy DEFAULT
</span></span></code></pre></div><p>Advanced settings</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">Enable the LDAPv3 password modify extended operation Off
</span></span><span class="line"><span class="cl">Validate password policy Off
</span></span><span class="line"><span class="cl">Trust Email Off
</span></span><span class="line"><span class="cl">Connection trace Off
</span></span></code></pre></div><p>This configuration connects Keycloak to an AD server at ldap://home.lab and authenticates there with a service account (CN=keycloak svc,OU=ServiceAccounts), allowing connection and read-only access.
Only users in OU=User,DC=home,DC=lab are used, sAMAccountName is used as the login name, computer objects are filtered out, and the users found are imported into the local database. Changes are synchronized every 60 seconds, and deleted AD users are removed automatically.</p>
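<p>To make the effect of the search scope and the user filter a bit more tangible, here is a minimal Python sketch. It does not talk to a directory at all: the entries are hypothetical, the object classes are shortened to the relevant ones, and the function names are made up. It only mimics what scope One Level plus the filter (user, but not computer) lets through.</p>

```python
# Hypothetical in-memory entries; a real directory would be queried over LDAP.
USERS_DN = "OU=User,DC=home,DC=lab"

ENTRIES = [
    {"dn": "CN=daniel,OU=User,DC=home,DC=lab", "objectClass": ["user"], "sAMAccountName": "daniel"},
    {"dn": "CN=dummy,OU=User,DC=home,DC=lab", "objectClass": ["user"], "sAMAccountName": "dummy"},
    # Computer accounts in AD also carry objectClass=user, hence the extra NOT clause.
    {"dn": "CN=PC01,OU=User,DC=home,DC=lab", "objectClass": ["user", "computer"], "sAMAccountName": "PC01$"},
    # Sits one OU deeper than the Users DN, so scope One Level skips it.
    {"dn": "CN=nested,OU=Sub,OU=User,DC=home,DC=lab", "objectClass": ["user"], "sAMAccountName": "nested"},
]


def parent_dn(dn):
    """Strip the first RDN. Naive split; assumes no escaped commas in the RDN."""
    return dn.split(",", 1)[1]


def importable_usernames(entries, users_dn):
    """Apply scope One Level plus the filter 'user AND NOT computer'."""
    names = []
    for entry in entries:
        classes = entry["objectClass"]
        in_scope = parent_dn(entry["dn"]).lower() == users_dn.lower()
        if in_scope and "user" in classes and "computer" not in classes:
            names.append(entry["sAMAccountName"])
    return names


if __name__ == "__main__":
    print(importable_usernames(ENTRIES, USERS_DN))
```

<p>If users live in nested OUs, the search scope would have to be Subtree instead of One Level, otherwise they are silently skipped like the nested entry in the sketch.</p>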
<p>Next, a group-ldap-mapper must be created in the Mappers tab. To do this, simply click on create new mapper and then select the type group-ldap-mapper.</p>
<p>My mapper is called ad-groups, and these are the settings:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">LDAP Groups DN OU=SecGroups,DC=home,DC=lab
</span></span><span class="line"><span class="cl">Relative creation DN 
</span></span><span class="line"><span class="cl">Group Name LDAP Attribute cn
</span></span><span class="line"><span class="cl">Group Object Classes group
</span></span><span class="line"><span class="cl">Preserve Group Inheritance Off
</span></span><span class="line"><span class="cl">Ignore Missing Groups Off
</span></span><span class="line"><span class="cl">Membership LDAP Attribute member
</span></span><span class="line"><span class="cl">Membership Attribute Type DN
</span></span><span class="line"><span class="cl">Membership User LDAP Attribute sAMAccountName
</span></span><span class="line"><span class="cl">LDAP Filter 
</span></span><span class="line"><span class="cl">Mode READ_ONLY
</span></span><span class="line"><span class="cl">User Groups Retrieve Strategy LOAD_GROUPS_BY_MEMBER_ATTRIBUTE
</span></span><span class="line"><span class="cl">Member-Of LDAP Attribute memberOf
</span></span><span class="line"><span class="cl">Mapped Group Attributes 
</span></span><span class="line"><span class="cl">Drop non-existing groups during sync Off
</span></span><span class="line"><span class="cl">Groups Path /
</span></span></code></pre></div><p>These settings tell Keycloak where and how LDAP groups are read: It searches for groups under OU=SecGroups,DC=home,DC=lab, recognizes them as group objects with names from cn, and reads membership via the member attribute (DN-based, users via sAMAccountName).
Keycloak works read-only, loads groups via the member attribute, does not automatically transfer missing/unmapped groups, and stores the imported groups under / in the Keycloak group tree.</p>
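<p>The retrieve strategy LOAD_GROUPS_BY_MEMBER_ATTRIBUTE boils down to a reverse lookup: find every group whose member attribute lists the DN of the user. A minimal Python sketch of that idea, with hypothetical group entries and a made-up function name, could look like this. It illustrates the lookup and is not how Keycloak implements it internally.</p>

```python
# Hypothetical group entries below OU=SecGroups. "member" holds full user DNs
# because the Membership Attribute Type is DN.
GROUPS = [
    {"cn": "vcf-admin", "member": ["CN=daniel,OU=User,DC=home,DC=lab", "CN=dummy,OU=User,DC=home,DC=lab"]},
    {"cn": "vcf-avi", "member": ["CN=daniel,OU=User,DC=home,DC=lab"]},
]


def groups_for_user(user_dn, groups):
    """Return the cn of every group whose member attribute lists the user's DN."""
    wanted = user_dn.lower()  # compare DNs case-insensitively in this sketch
    return sorted(
        group["cn"]
        for group in groups
        if wanted in (member.lower() for member in group["member"])
    )


if __name__ == "__main__":
    print(groups_for_user("CN=daniel,OU=User,DC=home,DC=lab", GROUPS))
```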
<p>Once LDAP is connected, users should now be found when searching for a user under Users.</p>
<figure><a href="01.png"><picture><source srcset="/avi-keycloak/01_hu_3b7296b05b56aeda.png" type="image/png">
          <img
            src="/avi-keycloak/01_hu_3b7296b05b56aeda.png"alt="User"width="3448"
            height="1972"/>
        </picture></a><figcaption><p>Imported AD User in Keycloak (click to enlarge)</p></figcaption></figure>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">For my test, I created two users, one named “daniel” and one named “dummy.” There are also two AD groups, one named “vcf-admin” and one named “vcf-avi.” My user “daniel” is in both groups. The dummy user is only in the vcf-admin group and serves as a control.
Access will be controlled later via the groups. Users in the vcf-admin group have access to my vCenter, users without groups have no access, and users in the vcf-avi group have access to AVI. Does that make sense?</div>
    </aside>
<p>Next, I create an Auth Profile in AVI. AVI is a bit special in this regard because the Keycloak client must be created with a very specific name.</p>
<h3 id="auth-profile-avi">Auth Profile AVI</h3>
<p>In AVI, navigate to Templates -&gt; Security -&gt; Auth Profile and create a new SAML profile.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">AVI also offers OAuth/OIDC, but I couldn&rsquo;t get it to work with any settings.
The documentation only mentions that SAML should be used with Workspace One (which we don&rsquo;t want). OIDC is only described in the documentation for virtual services. Fortunately, the Swiss Army knife Keycloak also supports SAML.</div>
    </aside>
<figure><a href="02.png"><picture><source srcset="/avi-keycloak/02_hu_110a7a4e439270c8.png" type="image/png">
          <img
            src="/avi-keycloak/02_hu_110a7a4e439270c8.png"alt="SAML"width="1245"
            height="1063"/>
        </picture></a><figcaption><p>SAML Settings AVI (click to enlarge)</p></figcaption></figure>
<p>Two things are important here: first, that the IDP metadata URL is used, and second, that the service provider is switched to FQDN.
The IDP metadata URL can be found in Keycloak under Realm settings -&gt; General -&gt; Endpoints SAML 2.0 Identity Provider Metadata.
Simply click on the browser link and copy the URL from the browser into AVI.</p>
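<p>If you prefer not to click through the UI, the metadata URL can also be assembled by hand. The following Python sketch assumes a Keycloak without the legacy /auth path prefix, which is the default in current versions. The function name is made up; the host and realm are the ones from my lab.</p>

```python
from urllib.parse import quote


def saml_descriptor_url(base_url, realm):
    """Build the SAML 2.0 IdP metadata URL of a realm.

    Assumes Keycloak is served without the legacy /auth prefix.
    """
    return f"{base_url.rstrip('/')}/realms/{quote(realm, safe='')}/protocol/saml/descriptor"


if __name__ == "__main__":
    print(saml_descriptor_url("https://keycloak.evilcorp.info", "lab-ad"))
```

<p>If the URL returns an XML document with an EntityDescriptor element in the browser, it is the right one for AVI.</p>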
<p>Once the profile has been saved, click on the context menu with the three dots in the Auth Profile overview and select Verify.
The Verify view displays the Entity ID and the SSO URL—very intuitive—not.</p>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">There are now two very important parameters here.
The Entity ID, which must become the Client ID in Keycloak, and the Single Sign-on URL. If either of these is incorrect in the Client Settings, SSO will not work.</div>
    </aside>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">Entity ID: AviController-vcf09-avi.lab.vcf
</span></span><span class="line"><span class="cl">SSO URL: https://vcf09-avi.lab.vcf/sso/acs/
</span></span></code></pre></div><h3 id="keycloak-client">Keycloak Client</h3>
<p>Keycloak Clients are applications and services that can request authentication of a user.
You can have multiple clients in one realm, and each client can have its own settings, such as client-specific roles. I&rsquo;ll come back to that later, because AVI has a few more peculiarities.
A client is created in the realm under Clients.</p>
<p>Create client</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">Client type SAML
</span></span><span class="line"><span class="cl">Client ID AviController-vcf09-avi.lab.vcf
</span></span><span class="line"><span class="cl">Name AVI - Test
</span></span><span class="line"><span class="cl">Description 
</span></span><span class="line"><span class="cl">Always display in UI Off
</span></span><span class="line"><span class="cl">Root URL 
</span></span><span class="line"><span class="cl">Home URL 
</span></span><span class="line"><span class="cl">Valid redirect URIs https://vcf09-avi.lab.vcf/sso/acs/
</span></span><span class="line"><span class="cl">Valid post logout redirect URIs https://vcf09-avi.lab.vcf/sso/acs/
</span></span><span class="line"><span class="cl">IDP-Initiated SSO URL name 
</span></span><span class="line"><span class="cl">IDP Initiated SSO Relay State 
</span></span><span class="line"><span class="cl">Master SAML Processing URL 
</span></span></code></pre></div><p>Those are the basic settings. Next, a few more settings need to be adjusted. To do this, open the client settings—the range of options is overwhelming at first, but fortunately, not much needs to be changed.</p>
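<p>Since a single typo in the Client ID or the SSO URL is enough to break SSO, a quick cross-check of both sides does not hurt. The following Python sketch is just that: a comparison of two hand-filled dictionaries. The dictionary keys and the function name are made up, nothing is read from AVI or Keycloak, and the exact string comparison is stricter than what Keycloak itself enforces.</p>

```python
# Values AVI shows in the Verify view (taken from this lab).
AVI_VERIFY = {
    "entity_id": "AviController-vcf09-avi.lab.vcf",
    "sso_url": "https://vcf09-avi.lab.vcf/sso/acs/",
}

# What was typed into the Keycloak client; the keys are made up for this sketch.
KEYCLOAK_CLIENT = {
    "client_id": "AviController-vcf09-avi.lab.vcf",
    "valid_redirect_uris": ["https://vcf09-avi.lab.vcf/sso/acs/"],
}


def client_mismatches(avi, client):
    """Return a list of readable problems; an empty list means both sides line up."""
    problems = []
    if client["client_id"] != avi["entity_id"]:
        problems.append("Client ID differs from the AVI Entity ID")
    if avi["sso_url"] not in client["valid_redirect_uris"]:
        problems.append("SSO URL is missing from the valid redirect URIs")
    return problems


if __name__ == "__main__":
    print(client_mismatches(AVI_VERIFY, KEYCLOAK_CLIENT) or "looks consistent")
```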
<p>The first setting must be made in Signature and Encryption. Here, Sign assertions must be set to On.
In addition, the name ID format must be changed to email (every user must have an email address in the AD).</p>
<p>A theme can also be selected under Login settings.
Next, in the Keys tab, I had to disable the client signature. This only became necessary after I changed the AVI SSL/TLS Profile and no longer allowed weak ciphers.
I haven&rsquo;t yet figured out exactly where the problem lies.
I imported the certificate as described in the AVI documentation, but it only works for me if I allow weak ciphers in AVI.
If anyone has any ideas, I would be grateful for any input.</p>
<p>Under Roles, I create a new client role called avi-access. I&rsquo;ll explain what this does a little later.
Under Client Scopes, you have to create a Group Mapper, otherwise AVI won&rsquo;t know anything about the groups.
To do this, it&rsquo;s important to go to the Dedicated scopes, which are named the same as the client, only with -dedicated.</p>
<figure><a href="03.png"><picture><source srcset="/avi-keycloak/03_hu_e79984bd388bad18.png" type="image/png">
          <img
            src="/avi-keycloak/03_hu_e79984bd388bad18.png"alt="Client Scope - dedicated"width="3446"
            height="1962"/>
        </picture></a><figcaption><p>Client Scope - dedicated  (click to enlarge)</p></figcaption></figure>
<p>Here, a new mapper must be configured. To do this, press the Configure new mapper button and select a mapper of type Group list.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">Group attribute name groups
</span></span><span class="line"><span class="cl">Friendly Name AD Groups
</span></span><span class="line"><span class="cl">SAML Attribute NameFormat Basic
</span></span><span class="line"><span class="cl">Single Group Attribute On
</span></span><span class="line"><span class="cl">Full group path Off
</span></span></code></pre></div><p>The Group Mapper in the “Client Dedicated” area ensures that when logging in, the user&rsquo;s groups are written to the token/SAML assertion for that specific client.

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">So it is important to note that AVI expects a single group attribute. If this is not activated, a separate attribute will be written for each group that a user has in Keycloak. The login will then no longer work if the user is a member of more than one group.</div>
    </aside></p>
<p>AVI will need the attribute later for the mapping profile. In the mapping profile, the attribute must have the same name as it was created in Keycloak—in my case, that is “groups”.</p>
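<p>To illustrate the difference the Single Group Attribute switch makes, here is a small Python sketch that builds simplified attribute elements for a user with two groups. It is a strongly reduced model: SAML namespaces and everything else in a real assertion are left out, and the function name is made up.</p>

```python
import xml.etree.ElementTree as ET


def group_attributes(groups, single_attribute, name="groups"):
    """Build simplified SAML Attribute elements for a user's groups."""
    if single_attribute:
        # On: one Attribute element carrying one AttributeValue per group.
        attribute = ET.Element("Attribute", Name=name)
        for group in groups:
            ET.SubElement(attribute, "AttributeValue").text = group
        return [attribute]
    # Off: one Attribute element per group, all sharing the same name.
    attributes = []
    for group in groups:
        attribute = ET.Element("Attribute", Name=name)
        ET.SubElement(attribute, "AttributeValue").text = group
        attributes.append(attribute)
    return attributes


if __name__ == "__main__":
    for mode in (True, False):
        print("Single Group Attribute", "On" if mode else "Off")
        for attribute in group_attributes(["vcf-admin", "vcf-avi"], mode):
            print(" ", ET.tostring(attribute, encoding="unicode"))
```

<p>With the switch off, a user in two groups ends up with two attributes of the same name, which is exactly the case AVI cannot handle.</p>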
<p>The SSO URL must be entered under Advanced.</p>
<p>Fine Grain SAML Endpoint Configuration</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">This section to configure exact URLs for Assertion Consumer and Single Logout Service.
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">Logo URL 
</span></span><span class="line"><span class="cl">Policy URL 
</span></span><span class="line"><span class="cl">Terms of service URL 
</span></span><span class="line"><span class="cl">Assertion Consumer Service POST Binding URL https://vcf09-avi.lab.vcf/sso/acs/
</span></span><span class="line"><span class="cl">Assertion Consumer Service Redirect Binding URL https://vcf09-avi.lab.vcf/sso/acs/
</span></span><span class="line"><span class="cl">Logout Service POST Binding URL 
</span></span><span class="line"><span class="cl">Logout Service Redirect Binding URL 
</span></span><span class="line"><span class="cl">Logout Service SOAP Binding URL 
</span></span><span class="line"><span class="cl">Logout Service ARTIFACT Binding URL 
</span></span><span class="line"><span class="cl">Artifact Binding URL 
</span></span><span class="line"><span class="cl">Artifact Resolution Service 
</span></span></code></pre></div><p>This completes the initial configuration of the client. However, do not be concerned, as we will need to configure more later on.</p>
<h3 id="role-mapping-for-avi">Role mapping for AVI</h3>
<p>Next, we use the aforementioned client role. This role can only be used in the AVI client in Keycloak.
With this, we will work around a peculiarity of AVI. AVI has a strange habit of creating a user in AVI as soon as authentication is complete, regardless of whether the user has access or not.
Currently, all I have to do is create a valid Keycloak session for a user, who is then created in AVI and receives an error message because they are not authorized.
The only problem is that this creates a defective SAML cookie and you can no longer log in to AVI, even if you try with a valid user. Great, I can already see the tickets coming in.
The solution is to delete the cookie in the browser and end the session in Keycloak. After that, you can log in again. Of course, you now have a user without authorization in AVI, which you must delete manually.</p>
<figure><a href="04.png"><picture><source srcset="/avi-keycloak/04_hu_9aceaea5572d90b3.png" type="image/png">
          <img
            src="/avi-keycloak/04_hu_9aceaea5572d90b3.png"alt="SAML Cookie broken"width="1160"
            height="737"/>
        </picture></a><figcaption><p>SAML Cookie broken (click to enlarge)</p></figcaption></figure>
<p>In general, this behavior is not a security issue, but user complaints are inevitable.
To prevent this, I am building in a check to see whether the user is authorized to access the AVI.
The check is performed at the Keycloak level, and the request is not sent to the AVI at all if there is no authorization for AVI.</p>
<p>To do this, a role mapping must be created in Keycloak. Under Groups -&gt; vcf-avi -&gt; Role Mapping, a role can be assigned to the group in Keycloak. The groups and user membership come from the AD, making it easy to manage centrally.
To assign the role, click Assign role and select the appropriate client role avi-access via client roles.</p>
<h3 id="auth-mapping-profile-for-avi">Auth Mapping Profile for AVI</h3>
<p>I promise, we&rsquo;ll soon be able to log in with Keycloak. An Auth Mapping Profile is still required in AVI.
This can be created under Templates -&gt; Security -&gt; Auth Mapping Profile. It must be a SAML type profile.
The profile is given a descriptive name and requires a rule.
I&rsquo;ll keep it simple in this example and say that all users whose SAML attribute contains vcf-avi become super users.
Remember, the SAML attribute contains all Keycloak groups. You could also write a regex, because with Contains I have the problem that a group named vcf-avi-test would also match.
But I&rsquo;ll leave that to the people who have to implement it in production.</p>
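<p>The difference between Contains and an anchored match is easy to show in a few lines of Python. This sketch only demonstrates the matching logic on a list of group names; how AVI evaluates its own regex rules is not modeled here, and the function names are made up.</p>

```python
import re


def contains_match(groups, needle="vcf-avi"):
    """Mimic a Contains rule: true if the needle appears anywhere in any group name."""
    return any(needle in group for group in groups)


def exact_match(groups, name="vcf-avi"):
    """Anchored alternative: the whole group name has to match."""
    pattern = re.compile(re.escape(name))
    return any(pattern.fullmatch(group) for group in groups)


if __name__ == "__main__":
    only_test_group = ["vcf-avi-test"]
    print("contains:", contains_match(only_test_group))
    print("exact:", exact_match(only_test_group))
```

<p>A user who is only in vcf-avi-test passes the Contains rule but not the anchored one.</p>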
<figure><a href="05.png"><picture><source srcset="/avi-keycloak/05_hu_82f3fe60328d4ea1.png" type="image/png">
          <img
            src="/avi-keycloak/05_hu_82f3fe60328d4ea1.png"alt="AVI Mapping Profile"width="2802"
            height="1964"/>
        </picture></a><figcaption><p>AVI Mapping Profile (click to enlarge)</p></figcaption></figure>
<p>Finally, the Auth profile and the Mapping Profile just need to be assigned.</p>
<h3 id="avi-authentication-settings">AVI Authentication settings</h3>
<p>You can do this easily by editing the settings under Administration -&gt; System Settings.
Under Authentication, you must switch from Local to Remote.
The Enable Local User Login checkbox should be selected, otherwise you will not be able to log in locally if SSO fails.
Finally, map the Auth Profile with the Mapping Profile, and then we&rsquo;re done and can run our first test.</p>
<h3 id="first-test">First Test</h3>
<p>If everything works correctly, after accessing the AVI page, a message should automatically appear stating that you are not logged in, and the Keycloak login screen should appear directly via the login link.</p>
<figure><a href="06.png"><picture><source srcset="/avi-keycloak/06_hu_5a45df03e2ee44b.png" type="image/png">
          <img
            src="/avi-keycloak/06_hu_5a45df03e2ee44b.png"alt="Keycloak SSO"width="3452"
            height="1964"/>
        </picture></a><figcaption><p>Keycloak SSO (click to enlarge)</p></figcaption></figure>
<p>You can log in with either your username or your email address.
Because I was smart enough to use my private email address, you can only see the username here.</p>
<figure><a href="07.png"><picture><source srcset="/avi-keycloak/07_hu_201815e447a24b5.png" type="image/png">
          <img
            src="/avi-keycloak/07_hu_201815e447a24b5.png"alt="AVI SSO"width="3444"
            height="1944"/>
        </picture></a><figcaption><p>AVI User (click to enlarge)</p></figcaption></figure>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content"><p>You can still log in with a local user by using this link:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">https://avi-url/#!/login?local=1
</span></span></code></pre></div></div>
    </aside>
<h2 id="interim-conclusion">Interim conclusion</h2>
<p>SSO login via Keycloak should now work. Unfortunately, it is not yet possible to log in with a FIDO key, and there is still nothing that stops the dummy user from attempting to log in to AVI. AVI rejects that user, and the rejection breaks our SAML cookie.
Time to change that. So grab a coffee, because this article could go on for quite a while.</p>
<h3 id="enable-fido-key">Enable FIDO Key</h3>
<p>To use a YubiKey or a similar device, a few options need to be set in Keycloak.
Passwordless login is also supported, but I&rsquo;m not personally a fan of it.
The activation process is similar, though. First, in the Realm Settings under Authentication -&gt; Required actions -&gt; Webauthn Register, set both Enabled and Set as default action.
For passwordless login, the corresponding option Webauthn Register Passwordless must be enabled and set as the default action.
Without the default action, users will not be asked to create a passkey when they log in for the first time.</p>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">If users have already logged in before, they must be deleted in Keycloak after you have activated the option.
Thanks to AD sync, the users will be back in no time.</div>
    </aside>
<p>Then the policy must be configured under Authentication -&gt; Policies -&gt; WebAuthn Policy or WebAuthn Passwordless Policy. For the WebAuthn Policy, activate Avoid same authenticator registration. For the WebAuthn Passwordless Policy, activate the same option and additionally enable passkeys.</p>
<figure><a href="08.png"><picture><source srcset="/avi-keycloak/08_hu_f05a080e01584914.png" type="image/png">
          <img
            src="/avi-keycloak/08_hu_f05a080e01584914.png"alt="Webauthn"width="1719"
            height="1358"/>
        </picture></a><figcaption><p>Webauthn Settings (click to enlarge)</p></figcaption></figure>
<h3 id="flow">Flow</h3>
<p>Finally, a flow must be created under Authentication -&gt; Flows. I use the built-in browser-based authentication flow as a basis and create a copy via the context menu.
In order to use the client role in the flow, the flow must be assigned to the client. This is done under Clients -&gt; AviController-xxx -&gt; Advanced -&gt; Authentication flow overrides. This way, each client can have its own flow with different checks or login methods.</p>
<p>A flow consists of three key elements. First, there are steps (executions), which are specific actions in the login process, such as deny access.
Second, there are conditions, which check something, such as whether the user has a certain role. Finally, there are sub-flows, which can be used to group steps and conditions.
In addition, each element has a requirement that determines whether the login continues or is terminated.
In short, a flow is a chain of steps, conditions, and sub-flows with rules that control the login process.</p>
<figure><a href="09.png"><picture><source srcset="/avi-keycloak/09_hu_7385dc10fe2aa28e.png" type="image/png">
          <img
            src="/avi-keycloak/09_hu_7385dc10fe2aa28e.png"alt="Flow"width="1721"
            height="1325"/>
        </picture></a><figcaption><p>Flow (click to enlarge)</p></figcaption></figure>
<ol>
<li>To activate passkeys, WebAuthn must be configured in the flow and set to Required.</li>
<li>Next, a sub-flow of type Generic must be created. This can be done using the small plus sign. It is important that the sub-flow sits in the correct position at the end and is indented correctly. The sub-flow is used to check role membership.</li>
<li>The sub-flow must be dragged and dropped into the correct position. It is sometimes easier to do this by dragging other elements upwards. The sub-flow must also be set to Conditional. The conditional block is only executed if the previous step is TRUE. In this flow, 2FA must have been successfully completed, otherwise the login will be canceled.</li>
<li>Next, a condition and a step must be created using the plus sign. Both must be set to Required, otherwise the check will not be performed and the deny access step will not be executed when the authorization is missing.</li>
<li>The condition must be of type user role, and client roles must be selected under Select role. Then I select the avi-access role and enable Negate output. Negate output ensures that the condition is TRUE if the user does not have the avi-access role. If the condition is TRUE, the deny access step is executed. If the user has the role, the condition is FALSE and the flow ends with access granted.</li>
</ol>
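<p>The decision path of the finished flow can also be written down in a few lines of Python. This is only a sketch of the logic described above, not Keycloak code: the function name avi_flow_allows_login and its parameters are made up for illustration, and only the role name avi-access comes from the setup in this article.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">def avi_flow_allows_login(webauthn_ok, client_roles, required_role="avi-access"):
    """Hypothetical model of the copied browser flow described above."""
    # WebAuthn is set to Required, so a failed 2FA ends the login.
    if not webauthn_ok:
        return False
    # Condition "user role" with Negate output enabled:
    # it is TRUE when the user does NOT have the role.
    has_role = required_role in client_roles
    condition = not has_role
    # Conditional sub-flow: its steps only run when the condition is TRUE.
    if condition:
        return False  # the "deny access" step is executed
    # Condition is FALSE: the sub-flow is skipped and access is granted.
    return True


print(avi_flow_allows_login(True, {"avi-access"}))   # True
print(avi_flow_allows_login(True, set()))            # False
print(avi_flow_allows_login(False, {"avi-access"}))  # False
</code></pre></div>
<p>Forgetting Negate output corresponds to dropping the "not" in the condition, which is exactly the case where users with the correct role get access denied.</p>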
<p>If everything has been done correctly, the flow should look like the screenshot. &ldquo;Indentation&rdquo; is just as important here as it is in YAML.</p>
<figure><a href="10.png"><picture><source srcset="/avi-keycloak/10_hu_cd8fda5cc35c85be.png" type="image/png">
          <img
            src="/avi-keycloak/10_hu_cd8fda5cc35c85be.png"alt="Flow details"width="1712"
            height="1319"/>
        </picture></a><figcaption><p>Flow details (click to enlarge)</p></figcaption></figure>
<p>Now that everything is configured, it should work roughly as shown in this video.</p>
<div class="video-container">
  <iframe width="560" height="315"
    src="https://www.youtube-nocookie.com/embed/MHD71gCeZMw"
    frameborder="0"
    allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
    allowfullscreen>
  </iframe>
</div>
<h2 id="troubleshooting">Troubleshooting</h2>
<p>Troubleshooting can be very time-consuming, and I had a few problems. I could bet that I logged into AVI at least 300 times in the last two days. So here are a few general tips.</p>
<ul>
<li>Delete your cookies. Even in a private browsing tab, cookies persist for as long as the session is open, and a leftover broken SAML cookie can prevent even valid logins.</li>
<li>Always delete the session in Keycloak. Just because you log out of AVI does not mean that the SSO session is also closed.</li>
<li>When dragging and dropping in a flow, elements may automatically be disabled, for example if they are moved to an invalid position. Keycloak attempts to save immediately after every move. This nearly drove me to distraction. Sometimes it makes more sense to move other elements than the element you actually want to move.</li>
<li>Enable logging in Keycloak. This is configured in the Realm Settings under Events.</li>
<li>If parts of the flow were not executed, the position of the sub-flow should be checked to see if it was nested correctly.</li>
<li>If “access denied” is displayed despite correct role assignment, the negation in the condition was probably forgotten.</li>
</ul>
<figure><a href="11.png"><picture><source srcset="/avi-keycloak/11_hu_7e29236e7e54c346.png" type="image/png">
          <img
            src="/avi-keycloak/11_hu_7e29236e7e54c346.png"alt="Sorry"width="2016"
            height="1326"/>
        </picture></a><figcaption><p>We are sorry&hellip;(to be honest, I&rsquo;m not.)</p></figcaption></figure>
<h2 id="summary">Summary</h2>
<p>Keycloak or OIDC is a damn powerful toolkit. However, it is anything but simple and, unfortunately, extremely poorly documented for AVI load balancers. I actually wanted to write down the configuration for the VCF client, but the blog article is becoming so extensive that most people probably won&rsquo;t read it anyway, so I&rsquo;d rather pick up the whole thing again in a second article.</p>
]]></content>
		</item>
		
		<item>
			<title>VCF9 - Automation in All Apps mode</title>
			<link>https://sdn-warrior.org/posts/vcf9-automation-all-apps/</link>
			<pubDate>Mon, 12 Jan 2026 02:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf9-automation-all-apps/</guid>
			<description><![CDATA[A quick introduction to VCF 9 Automation in All Apps mode]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>Ah Sh*t, Here We Go Again - Happy New Year! What was I thinking, letting you guys vote on what my next topic would be? But okay, I guess I only have myself to blame.
Anyway, over 60% wanted me to cover VCF9 automation in All Apps mode.
As most people know, there are two modes in VCF Automation: the so-called legacy mode, which is nothing more than VCF Automation 8.18, and the new and much more exciting All Apps mode.</p>
<p>The differences in brief: Legacy Mode follows a classic IaaS approach, focusing on the provision of individual VMs.
All Apps Mode is application-centric. The aim is to deploy an application as a whole, with VMs, containers, network, and storage being merely consumable resources.
VMs are consumed as a service in a “Kubernetes-native” manner.
This is why VKS and NSX (with VPCs) are also mandatory components of the automation solution in VCF9.</p>
<h2 id="prerequisites">Prerequisites</h2>
<p>But okay, let&rsquo;s get to the prerequisites for using automation:
We need a VCF9 deployment. In my lab, I have deployed a management design (formerly consolidated design) with two vSphere clusters.
The second cluster is not necessary, even though the current VCF 9 documentation claims otherwise.</p>
<p>NSX with a T0 router in active/standby mode and a VKS supervisor cluster are required. VPCs with stateful services such as Auto SNAT still need active/standby, because in VCF 9.0.1 the transit gateway inherits the HA mode from the T0.</p>
<p>Phew, that&rsquo;s quite a lot.
The article will not cover VCF9 deployment and NSX VPC setup.
You can find the basics of VPCs in my blog <a href="https://sdn-warrior.org/posts/vcf9-nsx-vpc/">here</a>, <a href="https://sdn-warrior.org/posts/vcf9-nsx-vpc-part2/">here</a>, and <a href="https://sdn-warrior.org/posts/vcf9-nsx-vpc-part3/">here</a>.</p>
<p>But here I will show you VCF Automation Deployment, a basic VKS deployment (with VPCs), setting up an organization in Automation, and creating a VM in All Apps Mode.</p>
<h2 id="vcf-automation-deployment">VCF Automation Deployment</h2>
<p>If Automation was not deployed via the VCF Installer, it can be deployed afterwards as a Day 2 operation via VCF Operations.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Automation initially requires 24 vCPUs and 96 GB of RAM. In my experience, the vCPUs can be changed to 16 after deployment. The RAM must not be reduced. VCF Automation is based internally on Kubernetes, and this cannot start properly with less RAM.</div>
    </aside>
<p>To deploy automation in small form factor, we need 1 FQDN/IP that serves as VIP and 2 additional IP addresses that are used as cluster node IP pool.
I use the internal load balancer in my setup.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">If you are using the internal load balancer, the FQDN of the cluster VIP must match the automation endpoint. In short, the automation appliance FQDN is the same as that of the cluster VIP.</div>
    </aside>
<p>The actual setup is fairly straightforward. The component is installed via Operations Fleet Manager under Lifecycle.</p>
<figure><a href="01.png"><picture><source srcset="/vcf9-auto/01_hu_d0dc34bedd45e889.png" type="image/png">
          <img
            src="/vcf9-auto/01_hu_d0dc34bedd45e889.png"alt="Auto deployment"width="1718"
            height="1321"/>
        </picture></a><figcaption><p>Automation Deployment (click to enlarge)</p></figcaption></figure>
<p>I won&rsquo;t go through every single step here, just the most important ones. When creating the certificate, I always create a wildcard certificate in the lab.
It is important that the VIP IP address and the FQDN are entered in the SANs. In my lab, the SAN entries would be *.lab.vcf, 10.28.13.12.</p>
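<p>If you want to double-check a SAN list before requesting the certificate, a few lines of Python are enough. This is a minimal sketch that only compares strings and IP addresses and does not parse a real certificate. The function name san_covers and the FQDN auto.lab.vcf are assumptions for illustration; the SAN entries are the ones from my lab.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress


def san_covers(san_entries, fqdn, vip):
    """Return True if the SAN list covers both the FQDN and the VIP address."""

    def is_ip(value):
        try:
            ipaddress.ip_address(value)
            return True
        except ValueError:
            return False

    def dns_match(pattern, name):
        pattern, name = pattern.lower(), name.lower()
        if pattern.startswith("*."):
            # A wildcard stands for exactly one leftmost label.
            _, _, rest = name.partition(".")
            return rest == pattern[2:]
        return pattern == name

    vip_addr = ipaddress.ip_address(vip)
    ip_ok = any(is_ip(e) and ipaddress.ip_address(e) == vip_addr for e in san_entries)
    dns_ok = any(not is_ip(e) and dns_match(e, fqdn) for e in san_entries)
    return ip_ok and dns_ok


# auto.lab.vcf is a hypothetical FQDN for the Automation VIP.
print(san_covers(["*.lab.vcf", "10.28.13.12"], "auto.lab.vcf", "10.28.13.12"))  # True
print(san_covers(["*.lab.vcf"], "auto.lab.vcf", "10.28.13.12"))                 # False
</code></pre></div>
<p>Note that a wildcard only covers one label, so *.lab.vcf would not match a name like auto.mgmt.lab.vcf.</p>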
<p>The network settings are also relatively self-explanatory. I deploy my automation to the VM MGMT Network, which was automatically created by VCF. I use the servers specified during the VCF installation as DNS and NTP servers.
You don&rsquo;t have to enter these again separately; the Edit button gives you a selection of all DNS and NTP servers known to VCF. This is somewhat counterintuitive.</p>
<figure><a href="02.png"><picture><source srcset="/vcf9-auto/02_hu_78603c1122abc3ca.png" type="image/png">
          <img
            src="/vcf9-auto/02_hu_78603c1122abc3ca.png"alt="Auto network"width="1435"
            height="1262"/>
        </picture></a><figcaption><p>Automation Network settings (click to enlarge)</p></figcaption></figure>
<p>Now comes the component part, where many (like me) are probably a little confused at first.
I don&rsquo;t find the description here very clear. Since the internal load balancer is used, the FQDN of the component and the cluster VIP must be identical.</p>
<figure><a href="03.png"><picture><source srcset="/vcf9-auto/03_hu_dea63aeda2deaab0.png" type="image/png">
          <img
            src="/vcf9-auto/03_hu_dea63aeda2deaab0.png"alt="Auto components"width="1426"
            height="1246"/>
        </picture></a><figcaption><p>Automation Components (click to enlarge)</p></figcaption></figure>
<p>The IP address registered for the FQDN is entered under Primary VIP.
The Internal Cluster CIDR is used for internal communication. It defaults to 198.18.0.0/15 and should not overlap with existing networks.
This range is reserved for benchmark testing (RFC 2544) and is not usually used in any network. The Cluster CIDR cannot be changed afterwards.</p>
<p>Additional VIPs are not required in the standard setup and can be left blank.
The two additional IP addresses I mentioned at the beginning are entered in the Cluster Node IP Pool.
These also come from the VM MGMT network. Each automation node is assigned an IP address from this pool.
A spare IP is required for upgrades, so the minimum size of the pool is always the number of automation nodes + 1.</p>
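<p>Both rules, no overlap for the Internal Cluster CIDR and nodes + 1 addresses in the pool, are easy to check up front. The following is a minimal sketch using the ipaddress module from the Python standard library. The function name, the example network 10.28.13.0/24 and the pool addresses are assumptions for illustration, and I am assuming one automation node for the small form factor.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress


def check_automation_network(cluster_cidr, existing_networks, node_pool, node_count):
    """Return a list of problems; an empty list means both rules hold."""
    problems = []
    internal = ipaddress.ip_network(cluster_cidr)
    # Rule 1: the Internal Cluster CIDR should not overlap existing networks.
    for net in existing_networks:
        if internal.overlaps(ipaddress.ip_network(net)):
            problems.append(f"{cluster_cidr} overlaps {net}")
    # Rule 2: one address per node plus one spare address for upgrades.
    needed = node_count + 1
    missing = max(0, needed - len(node_pool))
    if missing:
        problems.append(f"node IP pool is short by {missing} address(es)")
    return problems


# Hypothetical lab values: one node, two pool addresses, one routed network.
print(check_automation_network(
    "198.18.0.0/15",
    ["10.28.13.0/24"],
    ["10.28.13.13", "10.28.13.14"],
    node_count=1,
))  # []
</code></pre></div>
<p>Since the Cluster CIDR cannot be changed afterwards, it is worth running such a check against all routed networks before starting the deployment.</p>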
<p>If everything has been done correctly, the precheck will run without any problems.
The deployment takes some time. With my hardware, it takes just over an hour, but I have seen deployments that take up to 3 hours.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">If you have more than one VCF instance in Operations and the SDDC Manager of this additional instance is not reachable, the deployment process in step 14 will fail with a message stating that the SDDC Manager is not reachable. This error can be bypassed by selecting “Retry and skip,” but VCF Automation will then not recognize any VCF instances. At a minimum, the SDDC Managers of all instances must be reachable at the time of deployment.</div>
    </aside>
<p>After one or two cups of VCF Admin fuel, aka coffee, the installation should now be successfully complete and we can focus on the VKS deployment.</p>
<h2 id="vks-deployment-with-vpcs">VKS Deployment with VPCs</h2>
<p>First of all, VKS is not my area of expertise, so I&rsquo;m going to do the simplest VKS deployment possible here.
I&rsquo;m using the NSX load balancer instead of AVI, and I&rsquo;m using VPCs. I also use the simple installation and not the HA cluster of the supervisor.
If you want to read more about VKS, I recommend my colleague <a href="https://vi-universe.github.io/">Christian</a>, who does interesting things with Antrea, VPCs, and VKS.</p>
<p>To install VKS, you must first create a storage policy. To do this, you must first create a tag and a tag category.
The quickest way to do this is via vCenter by clicking on Storage and selecting Assign Tag.
Select Add Tag and then Create New Category to create a datastore tag category.</p>
<figure><a href="04.png"><picture><source srcset="/vcf9-auto/04_hu_90dbd8dd16e6dfde.png" type="image/png">
          <img
            src="/vcf9-auto/04_hu_90dbd8dd16e6dfde.png"alt="TAG Category"width="1734"
            height="1208"/>
        </picture></a><figcaption><p>TAG Category (click to enlarge)</p></figcaption></figure>
<p>It is important to select One Tag and Datastore here. Once the category has been created, a tag can be created and assigned to the storage.</p>
<p>Then I create the appropriate storage policy under Policies and Profiles -&gt; VM Storage Policies. This is necessary to ensure that the correct storage is selected during VKS deployment. The storage policy is tag-based and uses the previously created tag.</p>
<figure><a href="05.png"><picture><source srcset="/vcf9-auto/05_hu_c157549bb0b81d6a.png" type="image/png">
          <img
            src="/vcf9-auto/05_hu_c157549bb0b81d6a.png"alt="Storage Policy"width="2894"
            height="1654"/>
        </picture></a><figcaption><p>Storage Policy (click to enlarge)</p></figcaption></figure>
<p>Next, we start the VKS deployment via the vCenter Burger Menu and the Supervisor Management menu item.
During deployment, a content library is automatically created on the VKS Storage Policy data store.
If you want, you can also do this in advance, but since it should be as simple as possible, I use the automatically generated content library.</p>
<figure><a href="06.png"><picture><source srcset="/vcf9-auto/06_hu_4899aeb32cfe3f0f.png" type="image/png">
          <img
            src="/vcf9-auto/06_hu_4899aeb32cfe3f0f.png"alt="VKS deployment"width="3454"
            height="1864"/>
        </picture></a><figcaption><p>VKS deployment (click to enlarge)</p></figcaption></figure>
<p>In my opinion, deployment via VPCs is the simplest method, as it requires the fewest parameters to be set during deployment.
A new VPC must be created in advance for the supervisor cluster and a <em><strong>public VPC subnet</strong></em> for management.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">The management network can also be a standard port group such as the VM management network. It does not have to be a VPC subnet, but if it is a VPC subnet, it must be public. Unfortunately, VKS deployment does not have any prechecks, which means that it does not check whether the selected management network is reachable, for example. If something goes wrong, this means deactivating the deployment (a funny name for deleting) and starting from scratch.</div>
    </aside>
<p>Next, the supervisor location is determined. Since this is a normal cluster deployment and not a vSphere with Zones setup, Cluster Deployment must be selected.
In addition, a supervisor name is assigned (in lowercase letters) and, since this is to be a single node cluster, Control Plane HA remains disabled. As mentioned, I have two vSphere clusters in my setup and I want my supervisor to be in cluster m02-cl02.</p>
<figure><a href="07.png"><picture><source srcset="/vcf9-auto/07_hu_3460e0807abf82c0.png" type="image/png">
           <img
             src="/vcf9-auto/07_hu_3460e0807abf82c0.png"alt="VKS Supervisor location"width="3444"
             height="1856"/>
         </picture></a><figcaption><p>VKS Supervisor location (click to enlarge)</p></figcaption></figure>
<p>After that, the storage policy is assigned and you can proceed to the management network setup.
Normally, a pool of at least 4 IP addresses (1 VIP, 3 supervisor nodes) is sufficient. In a simple deployment, even fewer IP addresses are required, but I have specified a larger range to enable scale-out in the future. The remaining details are fairly self-explanatory.</p>
<figure><a href="08.png"><picture><source srcset="/vcf9-auto/08_hu_3740fd3c6d655f65.png" type="image/png">
           <img
             src="/vcf9-auto/08_hu_3740fd3c6d655f65.png"alt="VKS Supervisor Management Network"width="3438"
             height="1844"/>
         </picture></a><figcaption><p>VKS Supervisor Management Network (click to enlarge)</p></figcaption></figure>
<p>The next setting is the workload network, and the nice thing about the VPC setup is that most of it is already pre-filled and was specified during the NSX setup of the VPCs.
Since I want to keep this example simple, I am using the default project from NSX.
It would of course also be possible to put the supervisor in its own NSX project with its own T0.</p>
<p>The private (VPC) CIDRs can be freely assigned, as it is a non-routable network, so overlapping IP ranges are possible and permissible.
Nevertheless, you should be careful not to enter any of your routable networks as private CIDRs.
The service CIDR is preselected and usually does not need to be adjusted, and again comes from a privately reserved range.</p>
<p>DNS and NTP can be customized for the workers here, but this is not necessary in my case.
The workers reach these servers from the VPC via Auto SNAT, as a worker node cannot communicate directly via the private VPC CIDRs.</p>
<figure><a href="09.png"><picture><source srcset="/vcf9-auto/09_hu_92fba01630445f01.png" type="image/png">
           <img
             src="/vcf9-auto/09_hu_92fba01630445f01.png"alt="VKS Supervisor Workload Network"width="1718"
             height="1261"/>
         </picture></a><figcaption><p>VKS Supervisor Workload Network (click to enlarge)</p></figcaption></figure>
<p>Finally, the supervisor control plane size is determined. I am using Small (4 vCPUs and 16 GB RAM per supervisor control node) here.</p>
<figure><a href="10.png"><picture><source srcset="/vcf9-auto/10_hu_92617ea48af0430.png" type="image/png">
           <img
             src="/vcf9-auto/10_hu_92617ea48af0430.png"alt="VKS Supervisor Control Plane"width="1721"
             height="1317"/>
         </picture></a><figcaption><p>VKS Supervisor Control Plane (click to enlarge)</p></figcaption></figure>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Depending on the hardware, the small form factor may be too small, and deployment will either be extremely slow or even result in errors. Especially on weak NUCs, it is recommended to select Medium. The NSX edge VMs should also be deployed as Medium if you want to use the NSX load balancer. If the deployment runs into errors despite a correct network configuration, the hardware sizing may not be sufficient, especially if you have a CPU with E and P cores.</div>
    </aside>
<figure><a href="11.png"><picture><source srcset="/vcf9-auto/11_hu_f3529338d3b3276a.png" type="image/png">
          <img
            src="/vcf9-auto/11_hu_f3529338d3b3276a.png"alt="VKS Supervisor"width="3438"
            height="1844"/>
        </picture></a><figcaption><p>VKS Supervisor (click to enlarge)</p></figcaption></figure>
<p>The VKS setup takes quite a bit of time. I think it took me at least 1 to 2 coffees—so a good 20 minutes.
Yes, I should probably reconsider my coffee consumption.
And while we&rsquo;re on the subject of consumption – what a transition – the LCI, or Local Consumption Interface, still needs to be installed.</p>
<p>The LCI can be found in the Broadcom Support Portal under the Free Software category. It is a simple YAML file. If you don&rsquo;t want to search for it or deal with the Broadcom portal, you can copy version 9.0.2 (the current version at the time of publication) here.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">data.packaging.carvel.dev/v1alpha1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Package</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">cci-ns.vmware.com.9.0.2+f943fb89</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">refName</span><span class="p">:</span><span class="w"> </span><span class="l">cci-ns.vmware.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">version</span><span class="p">:</span><span class="w"> </span><span class="m">9.0.2</span><span class="l">+f943fb89</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">template</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">fetch</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">imgpkgBundle</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">image</span><span class="p">:</span><span class="w"> </span><span class="l">projects.packages.broadcom.com/vsphere/iaas/lci-service/9.0.2/lci-service:9.0.2-f943fb89</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">template</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">ytt</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">paths</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="l">config/</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">ignoreUnknownComments</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">kbld</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">paths</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="s1">&#39;-&#39;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="l">.imgpkg/images.yml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">deploy</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">kapp</span><span class="p">:</span><span class="w"> </span>{}<span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">valuesSchema</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">openAPIv3</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">object</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">additionalProperties</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l">OpenAPIv3 Schema for Consumption Interface</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">properties</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">namespace</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">string</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l">Namespace of the component</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">default</span><span class="p">:</span><span class="w"> </span><span class="l">cci-svc</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">podVMSupported</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">boolean</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l">This field indicates whether PodVMs are supported on the environment</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">default</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">virtualIP</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">string</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l">IP address of the Kubernetes LoadBalancer type service fronting the apiservers.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">default</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">data.packaging.carvel.dev/v1alpha1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">PackageMetadata</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">cci-ns.vmware.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">displayName</span><span class="p">:</span><span class="w"> </span><span class="l">Local Consumption Interface</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">shortDescription</span><span class="p">:</span><span class="w"> </span><span class="l">Provides the Local Consumption Interface for Namespaces within vSphere Client.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">longDescription</span><span class="p">:</span><span class="w"> </span><span class="l">Provides the Local Consumption Interface for Namespaces within vSphere Client.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">providerName</span><span class="p">:</span><span class="w"> </span><span class="l">VMware</span><span class="w">
</span></span></span></code></pre></div><p>The LCI in VKS provides vSphere resources (VM Service) locally via the Kubernetes Supervisor.
It makes virtual machines “consumable” for Kubernetes, as if they were native K8s objects. Automation uses the LCI for the deployment of VMs.</p>
<p>To install the LCI, it must first be registered as a Supervisor Service. To do this, open the vCenter burger menu and select Supervisor Management -&gt; Namespaces -&gt; Services -&gt; Add.</p>
<figure><a href="12.png"><picture><source srcset="/vcf9-auto/12_hu_d7025cb5efd851d6.png" type="image/png">
          <img
            src="/vcf9-auto/12_hu_d7025cb5efd851d6.png"alt="LCI"width="3438"
            height="1844"/>
        </picture></a><figcaption><p>LCI (click to enlarge)</p></figcaption></figure>
<p>After successful registration, another tile should appear under Services, and we can roll out the service via Local Consumption Interface -&gt; Actions -&gt; Manage Service. Select the Supervisor cluster and click Next through the rest of the dialog; no additional input is required. If everything has worked, the service is active on the Supervisor. This can be checked under Supervisor Management.</p>
<figure><a href="13.png"><picture><source srcset="/vcf9-auto/13_hu_8fd2dc2f9b8017d8.png" type="image/png">
          <img
            src="/vcf9-auto/13_hu_8fd2dc2f9b8017d8.png"alt="LCI status"width="1706"
            height="1092"/>
        </picture></a><figcaption><p>LCI Status (click to enlarge)</p></figcaption></figure>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">My experience with VKS in the lab is that you need to be patient. During deployment, you will occasionally see errors, especially if the content library is still loading images in the background or the supervisor cluster has not yet fully booted. This is normal at this stage. However, if the deployment makes no progress for a longer period, there are various debugging options.
The simplest test is to check whether you can ping the supervisor from vCenter. If that doesn&rsquo;t work, there is probably something wrong with the NSX/VPC setup, or a private subnet was used as the management network.</div>
    </aside>
<h3 id="troubleshooting-vks">Troubleshooting VKS</h3>
<p>Unfortunately, you cannot simply log in to the supervisor. Some of you may have noticed that we never set a password for it. To get access to the supervisor&rsquo;s root account, you must log in to vCenter via SSH and switch to the root shell.
Then you run a Python script that prints the supervisor&rsquo;s IP address and root password.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ssh root@vcf09-vcsa.lab.vcf 
</span></span><span class="line"><span class="cl">shell
</span></span><span class="line"><span class="cl">/usr/lib/vmware-wcp/decryptK8Pwd.py
</span></span></code></pre></div><p>The output should look something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">root@vcf09-vcsa [ ~ ]# /usr/lib/vmware-wcp/decryptK8Pwd.py
</span></span><span class="line"><span class="cl">Read key from file
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">Connected to PSQL
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">Cluster: domain-c12002:8f68f1b2-a1ac-4fce-a9bf-26c7492b2fcb
</span></span><span class="line"><span class="cl">IP: 10.29.0.19
</span></span><span class="line"><span class="cl">PWD: xxxxxxxxxxxxxxxxx
</span></span></code></pre></div><p>Since the supervisor is in a public VPC network, we should now be able to reach it via SSH with the IP address and password from the output. This can be done from vCenter or directly.</p>
<p>You can display the status of the pods using <em><strong>kubectl get pods -A</strong></em>. All pods should have the status Running or Completed.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">root@42022d56cee15501c2f40bcedfcb23b5 [ ~ ]# kubectl get pods -A
</span></span><span class="line"><span class="cl">NAMESPACE                                   NAME                                                              READY   STATUS      RESTARTS         AGE
</span></span><span class="line"><span class="cl">kube-state-metrics-domain-c12002            kube-state-metrics-5df7f5548d-jr68x                               2/2     Running     2 (4h20m ago)    30h
</span></span><span class="line"><span class="cl">kube-system                                 cns-storage-quota-extension-8fd45b984-2p4m6                       1/1     Running     1 (4h20m ago)    30h
</span></span><span class="line"><span class="cl">kube-system                                 coredns-86688d758-2j2gr                                           1/1     Running     9 (4h20m ago)    30h
</span></span><span class="line"><span class="cl">kube-system                                 docker-registry-42022d56cee15501c2f40bcedfcb23b5                  1/1     Running     1 (4h20m ago)    30h
</span></span><span class="line"><span class="cl">kube-system                                 etcd-42022d56cee15501c2f40bcedfcb23b5                             1/1     Running     1 (4h20m ago)    30h
</span></span><span class="line"><span class="cl">kube-system                                 kube-apiserver-42022d56cee15501c2f40bcedfcb23b5                   1/1     Running     3 (4h19m ago)    30h
</span></span><span class="line"><span class="cl">kube-system                                 kube-controller-manager-42022d56cee15501c2f40bcedfcb23b5          1/1     Running     3 (4h19m ago)    30h
</span></span><span class="line"><span class="cl">kube-system                                 kube-proxy-rdk2p                                                  1/1     Running     1 (4h20m ago)    30h
</span></span><span class="line"><span class="cl">kube-system                                 kube-scheduler-42022d56cee15501c2f40bcedfcb23b5                   2/2     Running     5 (4h19m ago)    30h
</span></span><span class="line"><span class="cl">kube-system                                 kubectl-plugin-vsphere-42022d56cee15501c2f40bcedfcb23b5           1/1     Running     5 (4h19m ago)    30h
</span></span><span class="line"><span class="cl">kube-system                                 snapshot-controller-576bd689df-ff7jg                              1/1     Running     1 (4h20m ago)    30h
</span></span><span class="line"><span class="cl">kube-system                                 storage-quota-webhook-6b8c9c45b9-c2pwx                            1/1     Running     5 (4h19m ago)    30h
</span></span><span class="line"><span class="cl">kube-system                                 supervisor-authz-service-controller-manager-9c666c964-4g4t6       1/1     Running     1 (4h20m ago)    30h
</span></span><span class="line"><span class="cl">kube-system                                 wcp-authproxy-42022d56cee15501c2f40bcedfcb23b5                    1/1     Running     2 (4h19m ago)    30h
</span></span><span class="line"><span class="cl">kube-system                                 wcp-fip-42022d56cee15501c2f40bcedfcb23b5                          1/1     Running     1 (4h20m ago)    30h
</span></span><span class="line"><span class="cl">...
</span></span><span class="line"><span class="cl">vmware-system-zoneop                        zone-operator-58bc66b7bb-wvpjf                                    1/1     Running     1 (4h20m ago)    30h
</span></span></code></pre></div><p>With <em><strong>kubectl get nodes</strong></em>, you can see whether the control plane and the agents on the ESX servers are ready.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">root@42022d56cee15501c2f40bcedfcb23b5 [ ~ ]# kubectl get nodes
</span></span><span class="line"><span class="cl">NAME                               STATUS   ROLES                  AGE   VERSION
</span></span><span class="line"><span class="cl">42022d56cee15501c2f40bcedfcb23b5   Ready    control-plane,master   30h   v1.31.6+vmware.3-fips
</span></span><span class="line"><span class="cl">vcf09-m02-esx01.lab.vcf            Ready    agent                  30h   v1.31.6-sph-vmware-clustered-infravisor-trunk-85-g71ed1bf
</span></span><span class="line"><span class="cl">vcf09-m02-esx02.lab.vcf            Ready    agent                  30h   v1.31.6-sph-vmware-clustered-infravisor-trunk-85-g71ed1bf
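</span></span></code></pre></div><p>To view the logs of a single pod, you can use kubectl logs. The -p flag (--previous) returns the logs of the previous container instance, which helps after a restart; without it, you get the logs of the running container. The command looks like this:</p>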
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">kubectl logs -n namespace -p POD/name 
</span></span></code></pre></div><p>For the CoreDNS pod, for example, the output looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">root@42022d56cee15501c2f40bcedfcb23b5 [ ~ ]# kubectl logs -n kube-system -p POD/coredns-86688d758-2j2gr  
</span></span><span class="line"><span class="cl">.:53 on 172.26.0.3
</span></span><span class="line"><span class="cl">[INFO] plugin/reload: Running configuration SHA512 = 0e63e439c10056457d08907176fbd8310056b76cf036ff455749731bc2e3eac256b51dfe7d2da9cddace06530a8353dbf6f9814cf475b3267f0f664f6d6b4241
</span></span><span class="line"><span class="cl">CoreDNS-1.11.3
</span></span><span class="line"><span class="cl">linux/amd64, go1.21.8 X:boringcrypto, v1.11.3+vmware.8-fips
</span></span><span class="line"><span class="cl">[INFO] 172.26.0.3:39797 - 26176 &#34;HINFO IN 234899636062536918.376157124480636184. udp 55 false 512&#34; NXDOMAIN qr,rd,ra 130 0.013725101s
</span></span><span class="line"><span class="cl">[INFO] 172.26.0.3:55311 - 31871 &#34;AAAA IN 42022d56cee15501c2f40bcedfcb23b5.vmware-system-monitoring.svc.cluster.local. udp 93 false 512&#34; NXDOMAIN qr,aa,rd 186 0.000161466s
</span></span><span class="line"><span class="cl">[INFO] 172.26.0.3:49051 - 45168 &#34;A IN 42022d56cee15501c2f40bcedfcb23b5.vmware-system-monitoring.svc.cluster.local. udp 93 false 512&#34; NXDOMAIN qr,aa,rd 186 0.0000998s
</span></span><span class="line"><span class="cl">[INFO] 172.26.0.3:37736 - 62871 &#34;AAAA IN 42022d56cee15501c2f40bcedfcb23b5.svc.cluster.local. udp 68 false 512&#34; NXDOMAIN qr,aa,rd 161 0.000090623s
</span></span><span class="line"><span class="cl">[INFO] 172.26.0.3:59085 - 42593 &#34;A IN 42022d56cee15501c2f40bcedfcb23b5.svc.cluster.local. udp 68 false 512&#34; NXDOMAIN qr,aa,rd 161 0.000124409s
</span></span><span class="line"><span class="cl">[INFO] 172.26.0.3:47017 - 52562 &#34;AAAA IN 42022d56cee15501c2f40bcedfcb23b5.cluster.local. udp 64 false 512&#34; NXDOMAIN qr,aa,rd 157 0.00005269s
</span></span><span class="line"><span class="cl">[INFO] 172.26.0.3:56547 - 58782 &#34;A IN 42022d56cee15501c2f40bcedfcb23b5.cluster.local. udp 64 false 512&#34; NXDOMAIN qr,aa,rd 157 0.000023421s
</span></span><span class="line"><span class="cl">...
</span></span></code></pre></div><p>If none of this helps, then you have to sacrifice your firstborn to the Kubernetes gods. In my lab, the cause has usually been either an incorrectly configured network or an undersized control plane.</p>
<p>With that, all the preparations for configuring Automation are done.</p>
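<p>With a long pod list, the problem cases are easy to overlook. A minimal sketch that filters the output down to pods whose STATUS is neither Running nor Completed, assuming awk is available in the shell where you run kubectl; unhealthy_pods is a hypothetical helper name and not part of kubectl:</p>

```shell
# Hypothetical helper: list pods whose STATUS column is neither
# Running nor Completed. Expects the table printed by
# `kubectl get pods -A` on stdin, where the columns are
# NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE.
unhealthy_pods() {
  awk 'NR == 1 { next }
       $4 !~ /^(Running|Completed)$/ { print $1 "/" $2 ": " $4 }'
}

# Usage on the supervisor:
#   kubectl get pods -A | unhealthy_pods
```

<p>An empty result means every pod is in one of the two expected states. The filter looks at the STATUS column of the table, so states such as CrashLoopBackOff show up as well.</p>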
<h2 id="setup-automation---my-first-tenant">Set up Automation - my first tenant</h2>
<p>After redeploying my VKS three times due to some typos, I can now log in to the automation GUI.
To log in to Provider Management, use <em><strong>system</strong></em> as the organization name. The username is admin and the password is the one you set during deployment.
If you want, you can replace the TLS certificate with your own via VCF Operations, but this is optional.</p>
<figure><a href="14.png"><picture><source srcset="/vcf9-auto/14_hu_eb7d135b20c786ce.png" type="image/png">
          <img
            src="/vcf9-auto/14_hu_eb7d135b20c786ce.png"alt="Automation"width="1718"
            height="1320"/>
        </picture></a><figcaption><p>Hello Automation (click to enlarge)</p></figcaption></figure>
<p>If the Automation start screen looks different after logging in to the provider portal, the VKS cluster was probably not recognized.
An inventory sync can help here.
I use Quick Start for the setup because it creates a complete organization with sensible settings in a very simple way. An organization is the top level in Automation and can also be thought of as a customer or tenant. Each organization is implemented as an NSX project, which ensures separation at the network level.
Resources, cost control, and policies always apply across the entire organization. But more on that later.</p>
<p>Using the Quick Start Wizard, the first thing I have to do is choose the name of my organization. In my setup, I name the organization SDN Warrior.
Next, I have to create a region and assign at least one VKS Supervisor cluster. A region consists of one or more supervisors and provides organizations with a collection of compute, memory, storage, and networking resources.
It can extend across multiple vCenter instances. An organization can contain one or more regions.</p>
<figure><a href="15.png"><picture><source srcset="/vcf9-auto/15_hu_ad9d66481c352dae.png" type="image/png">
          <img
            src="/vcf9-auto/15_hu_ad9d66481c352dae.png"alt="Automation Region"width="1719"
            height="1320"/>
        </picture></a><figcaption><p>Automation Region (click to enlarge)</p></figcaption></figure>
<p>Next, a storage policy must be selected. Here, I select my VKS policy. This can be adjusted later under Region Quota.
That is basically it. The rest is created automatically.</p>
<figure><a href="16.png"><picture><source srcset="/vcf9-auto/16_hu_dd4259123d1e4116.png" type="image/png">
          <img
            src="/vcf9-auto/16_hu_dd4259123d1e4116.png"alt="Automation Summary"width="1720"
            height="1318"/>
        </picture></a><figcaption><p>Automation Summary (click to enlarge)</p></figcaption></figure>
<p>Pretty simple. But we&rsquo;re not quite done yet, and we should first go through the settings that were created automatically.</p>
<p>If we now click on Organization in the Automation menu, we get a good overview of what has been set up, and it quickly becomes apparent that Quick Start has created significantly more than we had to configure.
A region quota has been created, which we can use to define the permitted VM classes.
By default, all default classes are permitted. These VM classes are our T-shirt sizes.
The existing VM classes can be customized in vCenter under Supervisor Management -&gt; Services.
It takes some time before newly created VM classes can be selected in the quota.</p>
<figure><a href="17.png"><picture><source srcset="/vcf9-auto/17_hu_335978e0b836bd45.png" type="image/png">
          <img
            src="/vcf9-auto/17_hu_335978e0b836bd45.png"alt="Quota"width="1718"
            height="1111"/>
        </picture></a><figcaption><p>Quota with custom VM Class (click to enlarge)</p></figcaption></figure>
<p>Under Networking in the organization, you can see that the VPC default settings have been used. If there are multiple edge clusters, a specific edge cluster can be selected. In the VPC Connectivity Profile, you can override the ingress and egress QoS settings.
By default, these are unlimited. IP ranges cannot be configured in the quota; the public VPC network is used by default. These IP ranges are only used for external communication. IP ranges within the organization are managed by the organization itself and not by the provider.</p>
<p>By far the most important setting for our lab is the first user. Currently, no user exists here. We should change that now.
Only one user can be created; for everything else, a single sign-on solution is required, but that is worth its own blog article.
I give my first user all rights. This should be the break-glass account. Normally, you will work with users from an identity source.
But who cares about security in the lab?</p>
<p>After creating a user for the Organization Portal, we take a quick look at the network settings.
Under Networking -&gt; General -&gt; IP Blocks, we can customize or expand the public IP block.
IP blocks represent IPs used both inside and outside this local data center, north and south of the provider gateway.
IPs within this scope are used for configuring services and networks.</p>
<figure><a href="18.png"><picture><source srcset="/vcf9-auto/18_hu_ffe6d42d6d7bf8d2.png" type="image/png">
          <img
            src="/vcf9-auto/18_hu_ffe6d42d6d7bf8d2.png"alt="Network Settings"width="1720"
            height="1109"/>
        </picture></a><figcaption><p>Network settings (click to enlarge)</p></figcaption></figure>
<p>Under Regions, you can customize the storage class or add more. Going through all the settings in detail would go beyond the scope of this blog, and we want to get results first.
But I have at least highlighted the most important settings.</p>
<h2 id="organization-portal">Organization Portal</h2>
<p>Everything else is done in the Organization Portal. To open it, go to Organizations, select the organization, and click Launch Organization Portal.
Alternatively, we log out of the provider portal, enter the name of our organization as the organization, and log in with the user we created earlier.</p>
<figure><a href="19.png"><picture><source srcset="/vcf9-auto/19_hu_5e64fadf272305d7.png" type="image/png">
          <img
            src="/vcf9-auto/19_hu_5e64fadf272305d7.png"alt="Org Portal"width="1720"
            height="1300"/>
        </picture></a><figcaption><p>Organization Portal (click to enlarge)</p></figcaption></figure>
<h3 id="project">Project</h3>
<p>We see that a project has already been created by default. This is done automatically when creating an organization.
We edit this by clicking on Projects and then on the project name.
A brief word about what a project is. A project connects users with the resources they are entitled to and the restrictions they are subject to.
A project typically defines an application development team, its users, how much and what type of infrastructure they can use, and which catalog items they can access.</p>
<p>I am renaming my project to a-team. Project names must be lowercase.
I save the project before creating the vSphere namespace.
Then I edit the project again and create a namespace. This refers to a vSphere namespace from VKS.
It is based on one of the predefined namespace classes and can be customized under Manage &amp; Govern if necessary.
In my case, small (best effort) is sufficient, which means that my project is limited to 10 GB of RAM and 10 GHz of CPU, with no reservations. Namespace classes can be used to manage resources effectively.
In addition, the region, VPC, and zone must be specified. This is not difficult in my lab, as there is only one of each.</p>
<figure><a href="20.png"><picture><source srcset="/vcf9-auto/20_hu_16ef17d5f78566b7.png" type="image/png">
          <img
            src="/vcf9-auto/20_hu_16ef17d5f78566b7.png"alt="Project"width="1715"
            height="1321"/>
        </picture></a><figcaption><p>Project (click to enlarge)</p></figcaption></figure>
<h3 id="dhcp-and-dns">DHCP and DNS</h3>
<p>Now we have almost everything we need to deploy a VM. However, I want to use DHCP in my lab, so I need to make a small adjustment in NSX.
DHCP works with Automation by default, but the VPC service profile has no DNS server configured. This can be changed quickly. To do this, I log in to NSX and switch to the SDN-Warrior project (click on Default in the upper left corner of NSX to switch projects).
The profile can be found under VPCs -&gt; Profiles -&gt; VPC Service Profile -&gt; Default VPC Service Profile. Here, I add my DNS server to the profile.</p>
<figure><a href="21.png"><picture><source srcset="/vcf9-auto/21_hu_f98d2ab9fd4beb97.png" type="image/png">
          <img
            src="/vcf9-auto/21_hu_f98d2ab9fd4beb97.png"alt="NSX"width="1712"
            height="894"/>
        </picture></a><figcaption><p>NSX VPC Profile (click to enlarge)</p></figcaption></figure>
<p>Once that has been changed, I switch to the Organization Portal. Under Build &amp; Deploy -&gt; Services -&gt; Network -&gt; vcf09-management-domain-default-vpc -&gt; SubnetSets, I create a new subnet set with the name vm and set the DHCP server.
The subnet set is, so to speak, my blueprint for how I use my network. It determines the type of network the VM will be in (in this case, a private network) and how an IP address will be assigned.</p>
<figure><a href="22.png"><picture><source srcset="/vcf9-auto/22_hu_8dc6b21ba26a13ec.png" type="image/png">
          <img
            src="/vcf9-auto/22_hu_8dc6b21ba26a13ec.png"alt="Subnet Set"width="1715"
            height="1226"/>
        </picture></a><figcaption><p>Subnet Set (click to enlarge)</p></figcaption></figure>
<h3 id="content-library">Content library</h3>
<p>We also need a content library. I create this via Build &amp; Deploy -&gt; Content Libraries. The region must be specified and the storage policy selected. I can then upload my Alpine Linux OVA template to the newly created content library.
Note that the space required counts against the quota configured in the provider setup. You don&rsquo;t have to use Alpine Linux here; any other VM template you have created previously will also work. I use Alpine because of its small size and because it is super easy to configure.</p>
<h3 id="blueprint-design">Blueprint Design</h3>
<p>Now that we have a basic setup, a customized network, and an image in our content library, we can write the blueprint.
A blueprint in VCF Automation is a standardized description of how a VM, service, or application should be automatically provisioned.
To do this, we click on Build &amp; Deploy -&gt; Blueprint Design -&gt; New from Blank Canvas.
The blueprint needs a name, in my case AlpineLinux, and it must be assigned to a project.</p>
<p>The Blueprint designer appears, and the developer now has to familiarize themselves with YAML.
To deploy a simple VM, the namespace must first be specified. To do this, we need a resource of type CCI.Supervisor.Namespace.
Then another resource is needed for the actual VM. This is of type CCI.Supervisor.Resource.
You can drag and drop the two resources onto the canvas or type the YAML directly.
I have a relatively simple YAML here that creates an Alpine Linux VM from a VM template.
However, it can also be used with other VM templates.
There is also a simple input form so that you can specify the VM name.
The name must be a valid lowercase DNS label, which a simple regex enforces. The blueprint needs to be adapted slightly to your environment before it can be used.</p>
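<p>To get a feel for which names the input form accepts, the pattern from the blueprint can be tried out in a shell first. A minimal sketch, assuming a grep that supports -E; is_valid_vm_name is a hypothetical helper name used only for illustration:</p>

```shell
# Hypothetical helper: succeeds if $1 matches the pattern used for the
# VM name input in the blueprint. LC_ALL=C keeps [a-z] limited to
# ASCII lowercase letters regardless of the system locale.
is_valid_vm_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$'
}

is_valid_vm_name alpine-01 || echo "alpine-01 rejected"
is_valid_vm_name Alpine_01 || echo "Alpine_01 rejected"   # prints: Alpine_01 rejected
```

<p>Names may contain lowercase letters, digits, and hyphens, must start and end with a letter or digit, and can be at most 63 characters long. The blueprint itself looks like this:</p>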
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-YAML" data-lang="YAML"><span class="line"><span class="cl"><span class="nt">formatVersion</span><span class="p">:</span><span class="w"> </span><span class="m">1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">inputs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">textField_6e3db194</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">string</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">title</span><span class="p">:</span><span class="w"> </span><span class="l">VM Name</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l">DNS-compatible name (lowercase, RFC-compliant)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">pattern</span><span class="p">:</span><span class="w"> </span><span class="l">^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">resources</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">CCI_Supervisor_Namespace_1</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">CCI.Supervisor.Namespace</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">properties</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">a-team-space-8sjpd</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">existing</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">Virtual_Machine_1</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">CCI.Supervisor.Resource</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">properties</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">context</span><span class="p">:</span><span class="w"> </span><span class="l">${resource.CCI_Supervisor_Namespace_1.id}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">manifest</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">vmoperator.vmware.com/v1alpha3</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">VirtualMachine</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">${input.textField_6e3db194}-${env.shortDeploymentId}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">className</span><span class="p">:</span><span class="w"> </span><span class="l">tiny</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">imageName</span><span class="p">:</span><span class="w"> </span><span class="l">vmi-bee6d37ed943e7fc7</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">powerState</span><span class="p">:</span><span class="w"> </span><span class="l">PoweredOn</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">storageClass</span><span class="p">:</span><span class="w"> </span><span class="l">vks-storage-policy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">network</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">interfaces</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">eth0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nt">network</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">vm</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                  </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">SubnetSet</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">bootstrap</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">cloudInit</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">cloudConfig</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nt">runcmd</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                  </span>- <span class="nt">&#39;disable_vmware_customization</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="s1">&#39;
</span></span></span><span class="line"><span class="cl"><span class="s1">      wait:
</span></span></span><span class="line"><span class="cl"><span class="s1">        conditions:
</span></span></span><span class="line"><span class="cl"><span class="s1">          - type: VirtualMachineCreated
</span></span></span><span class="line"><span class="cl"><span class="s1">            status: &#39;</span><span class="kc">True</span><span class="l">&#39;</span><span class="w">
</span></span></span></code></pre></div><p>To make it easier, I&rsquo;ll split the YAML into sections and describe where changes need to be made.
First, the vSphere namespace must be specified. Its name can be found under Build &amp; Deploy -&gt; Services -&gt; Overview.
Alternatively, it can also be found in vCenter or under Manage &amp; Govern -&gt; Namespaces.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-YAML" data-lang="YAML"><span class="line"><span class="cl"><span class="nt">resources</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">CCI_Supervisor_Namespace_1</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">CCI.Supervisor.Namespace</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">properties</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">a-team-space-8sjpd</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">existing</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>
<p>The VM class must be adjusted in the VM block, as I am using a custom class called Tiny. The existing VM classes can be found under Build &amp; Deploy -&gt; Services -&gt; Virtual Machine -&gt; VM Classes. One of the standard classes, for example, is
<em><strong>best-effort-xsmall</strong></em>.</p>
</li>
<li>
<p>The image name must be replaced with the image to be used. It can be found under Build &amp; Deploy -&gt; Content Hub -&gt; VM Images. The image identifier must be used.</p>
</li>
<li>
<p>The storage class must be adapted to the existing one. This can be found under Build &amp; Deploy -&gt; Services -&gt; Volume -&gt; Storage Classes.</p>
</li>
<li>
<p>Finally, the network must be adjusted. If you delete the entire network block, the default SubnetSet is used, which does not have DHCP enabled. In the previous steps, I created my own SubnetSet named vm.</p>
</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-YAML" data-lang="YAML"><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">className</span><span class="p">:</span><span class="w"> </span><span class="l">tiny</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">imageName</span><span class="p">:</span><span class="w"> </span><span class="l">vmi-bee6d37ed943e7fc7</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">powerState</span><span class="p">:</span><span class="w"> </span><span class="l">PoweredOn</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">storageClass</span><span class="p">:</span><span class="w"> </span><span class="l">vks-storage-policy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">network</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">interfaces</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">eth0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">network</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">vm</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">SubnetSet</span><span class="w">
</span></span></span></code></pre></div><p>This part does not need to be adjusted, but I would still like to briefly explain why it is included. Without this adjustment, the VM is created and connected to the correct network, but the Alpine VM is always disconnected afterwards.
This can be changed in vCenter, but that defeats the purpose: the users who will later consume the VMs should not have access to vCenter.
I&rsquo;ve had this problem with various Linux distributions, but I can&rsquo;t say whether it&rsquo;s a general Linux issue.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-YAML" data-lang="YAML"><span class="line"><span class="cl"><span class="nt">bootstrap</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">cloudInit</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">cloudConfig</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">runcmd</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">&#39;disable_vmware_customization</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="l">&#39;</span><span class="w">
</span></span></span></code></pre></div><p>The blueprint should now look more or less like this.</p>
<figure><a href="23.png"><picture><source srcset="/vcf9-auto/23_hu_e8e109c7af860955.png" type="image/png">
          <img
            src="/vcf9-auto/23_hu_e8e109c7af860955.png"alt="Subnet Set"width="1714"
            height="1319"/>
        </picture></a><figcaption><p>Subnet Set (click to enlarge)</p></figcaption></figure>
<p>Using the Test button, you can validate the YAML to ensure that at least the formatting is correct.
Using Deploy, you can run a deployment test. Once this has been successful, you can use Version to publish the current version in the catalog and make it accessible to users.
I recorded a short test video to show how it all works in practice. To do this, I used an advanced project user with limited rights to the project.</p>
<div class="video-container">
  <iframe width="560" height="315"
    src="https://www.youtube-nocookie.com/embed/WwIPaExRn0A"
    frameborder="0"
    allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
    allowfullscreen>
  </iframe>
</div>
<h3 id="bonus-round---deploying-a-kubernetes-cluster">Bonus Round - deploying a Kubernetes Cluster</h3>
<p>Spoiler: this isn&rsquo;t a perfect blueprint, but it shows that you can do more than just deploy VMs with VCF Automation in All Apps mode.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-YAML" data-lang="YAML"><span class="line"><span class="cl"><span class="nt">formatVersion</span><span class="p">:</span><span class="w"> </span><span class="m">1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">inputs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">textField_6e3db193</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">string</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">title</span><span class="p">:</span><span class="w"> </span><span class="l">clustername</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l">DNS-compatible name (lowercase, RFC-compliant)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">pattern</span><span class="p">:</span><span class="w"> </span><span class="l">^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">resources</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">CCI_Supervisor_Namespace_1</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">CCI.Supervisor.Namespace</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">properties</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">a-team-space-8sjpd</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">existing</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">Kubernetes_Cluster_1</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">CCI.Supervisor.Resource</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">properties</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">context</span><span class="p">:</span><span class="w"> </span><span class="l">${resource.CCI_Supervisor_Namespace_1.id}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">manifest</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">cluster.x-k8s.io/v1beta1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Cluster</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">${input.textField_6e3db193}-${env.shortDeploymentId}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">clusterNetwork</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">pods</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">cidrBlocks</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span>- <span class="m">192.168.156.0</span><span class="l">/20</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">services</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">cidrBlocks</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span>- <span class="m">10.96.0.0</span><span class="l">/12</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">serviceDomain</span><span class="p">:</span><span class="w"> </span><span class="l">cluster.local</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">topology</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">class</span><span class="p">:</span><span class="w"> </span><span class="l">builtin-generic-v3.4.0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">classNamespace</span><span class="p">:</span><span class="w"> </span><span class="l">vmware-system-vks-public</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">version</span><span class="p">:</span><span class="w"> </span><span class="l">v1.33.3---vmware.1-fips-vkr.1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">variables</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">vmClass</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nt">value</span><span class="p">:</span><span class="w"> </span><span class="l">best-effort-small</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">storageClass</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nt">value</span><span class="p">:</span><span class="w"> </span><span class="l">vks-storage-policy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">controlPlane</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">replicas</span><span class="p">:</span><span class="w"> </span><span class="m">1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nt">annotations</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                  </span><span class="nt">run.tanzu.vmware.com/resolve-os-image</span><span class="p">:</span><span class="w"> </span><span class="l">os-name=photon, content-library=cl-7685ae58460e6c079</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">workers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">machineDeployments</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span>- <span class="nt">class</span><span class="p">:</span><span class="w"> </span><span class="l">node-pool</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">kubernetes-cluster-seil-np-1xf6</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                  </span><span class="nt">replicas</span><span class="p">:</span><span class="w"> </span><span class="m">1</span><span class="w">
</span></span></span></code></pre></div><p>To deploy the cluster, only the namespace needs to be adjusted. The rest should run in every automation instance, as I only use defaults here.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-YAML" data-lang="YAML"><span class="line"><span class="cl"><span class="w"> </span><span class="nt">CCI_Supervisor_Namespace_1</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l">CCI.Supervisor.Namespace</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">properties</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">a-team-space-8sjpd</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">existing</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span></code></pre></div><figure><a href="24.png"><picture><source srcset="/vcf9-auto/24_hu_e1055aeacd50761f.png" type="image/png">
          <img
            src="/vcf9-auto/24_hu_e1055aeacd50761f.png"alt="Automation Instances"width="1709"
            height="1308"/>
        </picture></a><figcaption><p>Automation Instances Set (click to enlarge)</p></figcaption></figure>
<figure><a href="25.png"><picture><source srcset="/vcf9-auto/25_hu_99d15701f04ea748.png" type="image/png">
          <img
            src="/vcf9-auto/25_hu_99d15701f04ea748.png"alt="Cluster in vCenter"width="1045"
            height="578"/>
        </picture></a><figcaption><p>Automation workload in vCenter (click to enlarge)</p></figcaption></figure>
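<p>The pattern on the clustername input in the Kubernetes blueprint above decides which names a user can enter. As a minimal sketch, the same pattern can be tried out locally in Python before the blueprint is published. The function name and the example names are only assumptions for illustration. Keep in mind that the blueprint appends the short deployment ID, so the final resource name is longer than what the user types.</p>

```python
import re

# Pattern copied from the clustername input of the blueprint.
NAME_PATTERN = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")


def is_valid_name(name):
    """Return True if the name would pass the blueprint's input pattern."""
    return NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical example names, not taken from the lab.
    for candidate in ["demo-cluster", "Demo", "-demo", "demo-", "a"]:
        print(candidate, is_valid_name(candidate))
```

<p>The pattern accepts 1 to 63 characters, lowercase letters, digits, and hyphens, and it rejects a leading or trailing hyphen.</p>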
<h2 id="conclusion">Conclusion</h2>
<p>VCF 9 Automation is not just a simple update to the familiar automation solution; it is a new product. In this post, I have covered some very basic tasks and only scratched the surface of what can be done.
A general rethink of how VMs are managed is needed, because a VM that has been deployed in a vSphere namespace is no longer managed via vCenter.
This should also make it clear that a simple migration from Automation 8.x to Automation 9 is not easily possible (unless you use Legacy Mode). However, this is the future of the VCF platform.
True multitenancy can only be achieved with VCF Automation, and I look forward to exciting new possibilities with it.</p>
]]></content>
		</item>
		
		<item>
			<title>VCF9 - Building a VCF Operations Dashboard</title>
			<link>https://sdn-warrior.org/posts/vcf9-operations-dashboard/</link>
			<pubDate>Sat, 20 Dec 2025 03:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf9-operations-dashboard/</guid>
			<description><![CDATA[A short blog about the joys of creating a dashboard in VCF 9 Operations.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>I currently have two different systems in my home lab for monitoring things.
I started everything but never really finished it.
I have a Grafana dashboard for power consumption and an Uptime Kuma to check the general availability of systems.
Since I have to run VCF Operations anyway to license my Baselab, I thought: why not use it for monitoring as well?</p>
<p>How hard can it be? Spoiler alert: I never thought I&rsquo;d be building a Docker container, using a proxy, and converting things with Python, but let&rsquo;s dive down the rabbit hole.</p>
<h2 id="first-things-first-where-do-we-start">First things first, where do we start?</h2>
<p>In VCF 9, the Management Pack Builder is integrated into Operations. Management Pack Builder? Yes, that&rsquo;s the tool I&rsquo;ve probably seen the most over the last few days.
The MP allows you to integrate external systems or data sources and use them to create metrics and properties. Sounds simple, and it is—in theory.
But first you have to find the tool, because it has been well hidden in Operations.
You can find the MP under Administration -&gt; Integrations -&gt; Marketplace and then access it via the Create Management Pack button, or under Developer Center -&gt; Management Pack Builder.</p>
<h2 id="my-first-management-pack">My first management pack</h2>
<p>I will describe the process using the Unraid API as an example.
My UPS itself does not have an API, but my Unraid storage system can read my UPS data via a USB connection.
Since Unraid 7.2, there is now also an API that provides me with the data - nice.</p>
<p>In general, though, depending on the API, building a management pack can be super easy or hellish. More on that later.
Unraid relies on GraphQL, which is ideal for VCF Operations and, after a brief period of familiarization with how to access the data, is very easy to use.
The nice thing is that there is exactly one endpoint, namely <em><strong>/graphql</strong></em>, and Unraid has a built-in Apollo server where you can easily click together your queries.</p>
<h3 id="define-sources">Define sources</h3>
<p>First, a source must be defined in MP, which is usually relatively easy.
But here&rsquo;s the first stumbling block: please use variables if you are using anything other than Basic Auth.</p>
<figure><a href="01.png"><picture><source srcset="/vcf9-ops-dashboard/01_hu_c6f105c30317d252.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/01_hu_c6f105c30317d252.png"alt="Data Source Connection"width="1678"
            height="1205"/>
        </picture></a><figcaption><p>Data Source Connection (click to enlarge)</p></figcaption></figure>

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>Warning</b>
        </div>
        <div class="admonition-content">If credentials are not provided via Basic Auth or variables, the access data is unencrypted in the Management Pack, especially when it is exported and made available to others. Basic Auth should also be avoided, as user data is transmitted unencrypted.</div>
    </aside>
<p>The data source serves as a template, which is used to develop the management pack.
Once the management pack has been created, there are no more credentials in the data source, provided that you have followed the guidelines and not entered any credentials without variables.
If you use basic authentication, variables are used automatically.</p>
<p>The actual configuration is fairly straightforward: you have to select a collector, the API endpoint, and the port.
I have disabled certificate verification, but in production it should of course remain enabled. The base path is optional; for Unraid, graphql would be a good choice.
You don&rsquo;t have to specify a leading /; it is added automatically.
Since Unraid uses API tokens, I created a variable called API-Token via Custom Authentication.</p>
<figure><a href="02.png"><picture><source srcset="/vcf9-ops-dashboard/02_hu_59b8df1330394a46.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/02_hu_59b8df1330394a46.png"alt="Global Settings"width="1328"
            height="688"/>
        </picture></a><figcaption><p>Global Settings (click to enlarge)</p></figcaption></figure>
<p>Next are the global settings, which are very API-specific, and I will show another variation with the Veeam API later on.
Unraid is relatively simple here: the content type application/json is more or less standard for most APIs, and my token must be specified in the HTTP header x-api-key.
The small paste icon allows you to access the defined variables, and everything is automatically entered correctly so that the previously defined variable is used.</p>
<figure><a href="03.png"><picture><source srcset="/vcf9-ops-dashboard/03_hu_7b35f695bac69f21.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/03_hu_7b35f695bac69f21.png"alt="Test Connection"width="1174"
            height="1019"/>
        </picture></a><figcaption><p>Test Connection (click to enlarge)</p></figcaption></figure>
<p>The final step is to test the connection. If the test is not successful, the connection cannot be saved.
Any query can generally be used. In my example, I display the system version of my Unraid server.
Unraid only uses POST queries. The global HTTP headers are added automatically.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Content-Type is always included by the MP, whether you want it or not.
If Content-Type is not required, you can specify text/plain. The OPNSense API requires this.</div>
    </aside>
<p>With Unraid, you query data by sending the query in the request body. This is again very API-specific.</p>
<p>If everything works, you will receive a valid response on the right-hand side and you can save the connection. Step 1 is now successfully complete.</p>
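<p>If you want to check the API outside of Operations first, the same request can be rebuilt with a few lines of Python. This is only a sketch: the host name, the token, and the use of HTTPS are placeholders and assumptions, and the query is the generic GraphQL <em><strong>__typename</strong></em> query rather than a real Unraid query. It only mirrors what was configured above, a POST to /graphql with Content-Type application/json and the token in x-api-key.</p>

```python
import json
import urllib.request

# All values below are placeholders for illustration, not taken from the lab.
UNRAID_HOST = "unraid.example.local"
API_TOKEN = "replace-with-your-api-token"
# Generic GraphQL query; build the real one in the Apollo sandbox on the server.
QUERY = "{ __typename }"


def build_request(host, token, query):
    """Build the POST request for the single /graphql endpoint."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{host}/graphql",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-api-key": token,
        },
    )


if __name__ == "__main__":
    request = build_request(UNRAID_HOST, API_TOKEN, QUERY)
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To actually send it, uncomment the following lines:
    # with urllib.request.urlopen(request) as response:
    #     print(json.load(response))
```

<p>The sketch only builds and prints the request, so nothing is sent until you uncomment the last lines and fill in your own host and token.</p>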
<h3 id="create-requests">Create Requests</h3>
<p>Next, the requests must be created. These requests will then be used later to retrieve the data.</p>
<figure><a href="04.png"><picture><source srcset="/vcf9-ops-dashboard/04_hu_d885d8f6fbc892e7.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/04_hu_d885d8f6fbc892e7.png"alt="Requests"width="1442"
            height="620"/>
        </picture></a><figcaption><p>Requests (click to enlarge)</p></figcaption></figure>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">If you need to create additional requests afterwards (after the management pack has been installed) because data is missing or the API has changed, you must uninstall the management pack and reinstall the updated version.
Afterwards, existing dashboards must be adjusted again.</div>
    </aside>
<figure><a href="05.png"><picture><source srcset="/vcf9-ops-dashboard/05_hu_9c01d4d57a22c8cf.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/05_hu_9c01d4d57a22c8cf.png"alt="API Requests"width="1519"
            height="1108"/>
        </picture></a><figcaption><p>API Requests (click to enlarge)</p></figcaption></figure>
<p>In my example, I query the utilization of my UPS via the Unraid API so that I can use the data later in an object.
And as you can see here, I didn&rsquo;t follow my own recommendation for this request and my API token is hardcoded in the request. Please do better.
In my defense, that was my first request, and I will adjust it in the next update of the management pack.</p>
<h3 id="create-objects">Create Objects</h3>
<p>Next, the objects must be created. Each object needs at least one identifier, which must be unique, and an object can contain several metrics and/or properties.
Metrics and properties are not mandatory. An object can therefore consist of metrics only or properties only.
The difference between metrics and properties is quite simple: metrics can be calculated and must be of the decimal data type, while properties are strings.
The unique ID is implemented via a property. Each object can only receive data via one request.</p>
<figure><a href="06.png"><picture><source srcset="/vcf9-ops-dashboard/06_hu_1311526b831a392c.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/06_hu_1311526b831a392c.png"alt="Objects"width="1669"
            height="1064"/>
        </picture></a><figcaption><p>Objects (click to enlarge)</p></figcaption></figure>
<p>When selecting the attributes, it is important to pay close attention to what is selected and always rely on the result of the API.
In the end, clearly assignable values are required. Arrays such as upsDevices.* or data.* are not suitable for metrics.
In this case, data.upsDevices.* must be selected, as all results always belong to exactly one upsDevice and can therefore ideally represent an object.
If there are several UPS systems in upsDevices, several objects are automatically created.</p>
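<p>The mapping from array entries to objects can be pictured with a few lines of Python. This is only a mental model of the behaviour described above, not the builder's actual implementation. The sample response and the field names <code>name</code> and <code>loadPercentage</code> are invented; only the <code>data.upsDevices</code> nesting follows the text.</p>

```python
# Invented sample response; only the data.upsDevices nesting follows the text.
response = {
    "data": {
        "upsDevices": [
            {"name": "UPS-A", "loadPercentage": 34.0},
            {"name": "UPS-B", "loadPercentage": 12.5},
        ]
    }
}


def objects_from_response(resp: dict) -> dict:
    """Return one object per entry of data.upsDevices, keyed by its name."""
    objects = {}
    for device in resp.get("data", {}).get("upsDevices", []):
        # Each array element belongs to exactly one UPS, so it maps to one object.
        objects[device["name"]] = device
    return objects


objects = objects_from_response(response)
print(sorted(objects))  # ['UPS-A', 'UPS-B']
```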
<figure><a href="07.png"><picture><source srcset="/vcf9-ops-dashboard/07_hu_f944e41555b8bcbf.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/07_hu_f944e41555b8bcbf.png"alt="Properties and Metrics"width="1678"
            height="906"/>
        </picture></a><figcaption><p>Properties and Metrics (click to enlarge)</p></figcaption></figure>
<p>Selecting the metrics and properties is super simple, but at the same time it&rsquo;s the biggest problem when dealing with APIs.
Unfortunately, you can&rsquo;t convert the results in any other way; it&rsquo;s either a string or a decimal.
It&rsquo;s like having only a hammer: suddenly everything looks like a nail.
In my tests with different APIs, almost everything was a string. At least with Decimal, you can still specify the most common units, such as %, bit, byte, byte/s, and so on.</p>
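<p>A rough way to predict which values can become metrics is to check whether they parse as a plain number. The helper below is a hypothetical sketch for pre-checking API results and makes no claim about how the Management Pack Builder decides internally.</p>

```python
def classify(value) -> str:
    """Return "metric" if the value is a plain number, otherwise "property"."""
    if isinstance(value, bool):
        return "property"
    if isinstance(value, (int, float)):
        return "metric"
    try:
        float(value)
    except (TypeError, ValueError):
        # Anything with a unit attached, such as "6.7 ms", ends up here.
        return "property"
    return "metric"


print(classify(34.0))      # metric
print(classify("6.7 ms"))  # property
```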

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">A little tip: if you open the metric or property using the small &raquo; arrows, you will see a preview of the future metric or property.</div>
    </aside>
<p>Finally, you just need to select a property as the object instance name and an object ID, which must of course be unique. Since I only have one UPS, I can use the UPS name as both the object instance name and the object ID.</p>
<p>Before the Management Pack can be installed, a verification must be performed.
Relationships and events are optional; I do not need them for my purposes at this time, so I will leave them out.</p>
<figure><a href="08.png"><picture><source srcset="/vcf9-ops-dashboard/08_hu_e2d57c7da7fe8470.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/08_hu_e2d57c7da7fe8470.png"alt="Verify"width="1442"
            height="888"/>
        </picture></a><figcaption><p>Verify (click to enlarge)</p></figcaption></figure>
<p>Congratulations, we have now created our first objects and metrics definition.</p>
<h2 id="side-quest-incompatible-apis-or-rather-what-to-do-when-i-need-a-metric-but-only-get-properties">Side quest: Incompatible APIs, or rather, what to do when I need a metric but only get properties.</h2>
<p>Well, friends, I had this problem with the OPNSense API.
I can create objects perfectly well using the method described, but the API only returns a string with, for example, 6.7 ms for the latency.
The Management Pack Builder can only interpret this as a string because of the unit that is attached, which means I can&rsquo;t use it for calculations or to display a metric history, for example.
But there is a solution.</p>
<h3 id="fastapi--phyton--beautiful-metrics">FastAPI + Python = beautiful metrics</h3>
<p>What is FastAPI? FastAPI was created to build type-safe REST APIs quickly and easily with Python. I use uvicorn as a fast and lightweight web server.
I did the development in PyCharm. The tool runs on my system as a Docker container. In PyCharm, I created a new project and created exactly two files: requirements.txt and my actual proxy.py.
The required libraries are listed in requirements.txt.</p>
<p>requirements.txt:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">fastapi
</span></span><span class="line"><span class="cl">uvicorn<span class="o">[</span>standard<span class="o">]</span>
</span></span><span class="line"><span class="cl">requests
</span></span></code></pre></div><p>proxy.py:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">import os, re
</span></span><span class="line"><span class="cl">import requests
</span></span><span class="line"><span class="cl">from fastapi import FastAPI, Request, Response, HTTPException
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nv">UPSTREAM</span> <span class="o">=</span> os.environ.get<span class="o">(</span><span class="s2">&#34;UPSTREAM_BASE&#34;</span>, <span class="s2">&#34;https://xxx.xxx&#34;</span><span class="o">)</span>
</span></span><span class="line"><span class="cl"><span class="nv">VERIFY_TLS</span> <span class="o">=</span> os.environ.get<span class="o">(</span><span class="s2">&#34;VERIFY_TLS&#34;</span>, <span class="s2">&#34;false&#34;</span><span class="o">)</span>.lower<span class="o">()</span> <span class="o">==</span> <span class="s2">&#34;true&#34;</span>
</span></span><span class="line"><span class="cl"><span class="nv">TIMEOUT</span> <span class="o">=</span> float<span class="o">(</span>os.environ.get<span class="o">(</span><span class="s2">&#34;TIMEOUT&#34;</span>, <span class="s2">&#34;10&#34;</span><span class="o">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nv">app</span> <span class="o">=</span> FastAPI<span class="o">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nv">_ms</span> <span class="o">=</span> re.compile<span class="o">(</span>r<span class="s2">&#34;^\s*([0-9]+(?:\.[0-9]+)?)\s*ms\s*</span>$<span class="s2">&#34;</span>, re.I<span class="o">)</span>
</span></span><span class="line"><span class="cl"><span class="nv">_pct</span> <span class="o">=</span> re.compile<span class="o">(</span>r<span class="s2">&#34;^\s*([0-9]+(?:\.[0-9]+)?)\s*%\s*</span>$<span class="s2">&#34;</span>, re.I<span class="o">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">def parse_ms<span class="o">(</span>v<span class="o">)</span>:
</span></span><span class="line"><span class="cl">    <span class="nv">m</span> <span class="o">=</span> _ms.match<span class="o">(</span>v or <span class="s2">&#34;&#34;</span><span class="o">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> float<span class="o">(</span>m.group<span class="o">(</span>1<span class="o">))</span> <span class="k">if</span> m <span class="k">else</span> None
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">def parse_pct<span class="o">(</span>v<span class="o">)</span>:
</span></span><span class="line"><span class="cl">    <span class="nv">m</span> <span class="o">=</span> _pct.match<span class="o">(</span>v or <span class="s2">&#34;&#34;</span><span class="o">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> float<span class="o">(</span>m.group<span class="o">(</span>1<span class="o">))</span> <span class="k">if</span> m <span class="k">else</span> None
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">@app.get<span class="o">(</span><span class="s2">&#34;/api/routing/settings/search_gateway&#34;</span><span class="o">)</span>
</span></span><span class="line"><span class="cl">def search_gateway<span class="o">(</span>request: Request<span class="o">)</span>:
</span></span><span class="line"><span class="cl">    <span class="nv">auth</span> <span class="o">=</span> request.headers.get<span class="o">(</span><span class="s2">&#34;authorization&#34;</span><span class="o">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> not auth:
</span></span><span class="line"><span class="cl">        raise HTTPException<span class="o">(</span><span class="nv">status_code</span><span class="o">=</span>401, <span class="nv">detail</span><span class="o">=</span><span class="s2">&#34;Missing Authorization header&#34;</span><span class="o">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="nv">url</span> <span class="o">=</span> f<span class="s2">&#34;{UPSTREAM}/api/routing/settings/search_gateway&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="nv">headers</span> <span class="o">=</span> <span class="o">{</span><span class="s2">&#34;Authorization&#34;</span>: auth, <span class="s2">&#34;Accept&#34;</span>: <span class="s2">&#34;application/json&#34;</span><span class="o">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    try:
</span></span><span class="line"><span class="cl">        <span class="nv">r</span> <span class="o">=</span> requests.get<span class="o">(</span>url, <span class="nv">headers</span><span class="o">=</span>headers, <span class="nv">timeout</span><span class="o">=</span>TIMEOUT, <span class="nv">verify</span><span class="o">=</span>VERIFY_TLS<span class="o">)</span>
</span></span><span class="line"><span class="cl">    except requests.RequestException as e:
</span></span><span class="line"><span class="cl">        raise HTTPException<span class="o">(</span><span class="nv">status_code</span><span class="o">=</span>502, <span class="nv">detail</span><span class="o">=</span>f<span class="s2">&#34;Upstream request failed: {e}&#34;</span><span class="o">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> r.status_code &gt;<span class="o">=</span> 400:
</span></span><span class="line"><span class="cl">        <span class="k">return</span> Response<span class="o">(</span>
</span></span><span class="line"><span class="cl">            <span class="nv">content</span><span class="o">=</span>r.text,
</span></span><span class="line"><span class="cl">            <span class="nv">status_code</span><span class="o">=</span>r.status_code,
</span></span><span class="line"><span class="cl">            <span class="nv">media_type</span><span class="o">=</span>r.headers.get<span class="o">(</span><span class="s2">&#34;content-type&#34;</span>, <span class="s2">&#34;text/plain&#34;</span><span class="o">)</span>,
</span></span><span class="line"><span class="cl">        <span class="o">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    try:
</span></span><span class="line"><span class="cl">        <span class="nv">data</span> <span class="o">=</span> r.json<span class="o">()</span>
</span></span><span class="line"><span class="cl">    except ValueError:
</span></span><span class="line"><span class="cl">        raise HTTPException<span class="o">(</span><span class="nv">status_code</span><span class="o">=</span>502, <span class="nv">detail</span><span class="o">=</span><span class="s2">&#34;Upstream returned invalid JSON&#34;</span><span class="o">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">for</span> row in data.get<span class="o">(</span><span class="s2">&#34;rows&#34;</span>, <span class="o">[])</span>:
</span></span><span class="line"><span class="cl">        <span class="k">if</span> isinstance<span class="o">(</span>row.get<span class="o">(</span><span class="s2">&#34;delay&#34;</span><span class="o">)</span>, str<span class="o">)</span>:
</span></span><span class="line"><span class="cl">            <span class="nv">v</span> <span class="o">=</span> parse_ms<span class="o">(</span>row<span class="o">[</span><span class="s2">&#34;delay&#34;</span><span class="o">])</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> v is not None:
</span></span><span class="line"><span class="cl">                row<span class="o">[</span><span class="s2">&#34;delay_ms&#34;</span><span class="o">]</span> <span class="o">=</span> v
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">if</span> isinstance<span class="o">(</span>row.get<span class="o">(</span><span class="s2">&#34;stddev&#34;</span><span class="o">)</span>, str<span class="o">)</span>:
</span></span><span class="line"><span class="cl">            <span class="nv">v</span> <span class="o">=</span> parse_ms<span class="o">(</span>row<span class="o">[</span><span class="s2">&#34;stddev&#34;</span><span class="o">])</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> v is not None:
</span></span><span class="line"><span class="cl">                row<span class="o">[</span><span class="s2">&#34;stddev_ms&#34;</span><span class="o">]</span> <span class="o">=</span> v
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">if</span> isinstance<span class="o">(</span>row.get<span class="o">(</span><span class="s2">&#34;loss&#34;</span><span class="o">)</span>, str<span class="o">)</span>:
</span></span><span class="line"><span class="cl">            <span class="nv">v</span> <span class="o">=</span> parse_pct<span class="o">(</span>row<span class="o">[</span><span class="s2">&#34;loss&#34;</span><span class="o">])</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> v is not None:
</span></span><span class="line"><span class="cl">                row<span class="o">[</span><span class="s2">&#34;loss_pct&#34;</span><span class="o">]</span> <span class="o">=</span> v
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> data
</span></span></code></pre></div><p>The whole thing is started in my development environment with <em><strong>uvicorn proxy:app --host 0.0.0.0 --port 666</strong></em></p>
<figure><a href="09.png"><picture><source srcset="/vcf9-ops-dashboard/09_hu_61bb681e3ba8fe8.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/09_hu_61bb681e3ba8fe8.png"alt="PyCharm"width="1717"
            height="1250"/>
        </picture></a><figcaption><p>PyCharm (click to enlarge)</p></figcaption></figure>
<p>Of course, this is only intended for development purposes.
Later, I built a Docker container from it that runs on my Unraid server, is backed up, and also starts automatically.
It is important to note that authentication is not touched and is simply passed on.
In Python, I parse the values for delay, stddev, and loss and remove the units.
However, I keep the original values and simply output additional values that do not have units.
FastAPI thus serves as an API proxy, and I no longer address my OPNSense directly, but rather the FastAPI Docker container, which then returns a usable API to me.</p>
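<p>To see what the proxy does to a single gateway row without starting a web server, the transformation can be tried in isolation. The sketch below reuses the two regular expressions from proxy.py; the helper name <code>add_numeric_fields</code> and the sample row are made up for illustration.</p>

```python
import re

# Same patterns as in proxy.py: a number followed by "ms" or "%".
_MS = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)\s*ms\s*$", re.I)
_PCT = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)\s*%\s*$", re.I)

# (source key, pattern, suffix of the additional unit-free key)
_FIELDS = (("delay", _MS, "_ms"), ("stddev", _MS, "_ms"), ("loss", _PCT, "_pct"))


def add_numeric_fields(row: dict) -> dict:
    """Return a copy of the row with extra unit-free float values."""
    out = dict(row)  # keep the original string values untouched
    for key, pattern, suffix in _FIELDS:
        value = row.get(key)
        if isinstance(value, str):
            match = pattern.match(value)
            if match:
                out[key + suffix] = float(match.group(1))
    return out


# Invented sample row for illustration.
sample = {"name": "WAN_GW", "delay": "6.7 ms", "stddev": "0.4 ms", "loss": "0.0 %"}
result = add_numeric_fields(sample)
print(result["delay"], result["delay_ms"])  # 6.7 ms 6.7
```

<p>Values that do not match the pattern are simply left as they are, so the Management Pack Builder can still read them as properties.</p>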
<p>But back to the actual topic.</p>
<h2 id="building-dashboards">Building Dashboards</h2>
<p>I now have my object definitions, and in order to actually receive data, I need to create an account under Administration -&gt; Integrations.
Unraid (the created management pack) is now also available as an account type, and I need to enter the data for the API here.
Host name, collector, port, SSL configuration, and so on.</p>
<p>To view my objects now, you can find them under <em><strong>Inventory -&gt; Integrations</strong></em>.
The object type <em><strong>UPSLoad</strong></em> that I created can be found in the Unraid integration.
If the API returned two UPS devices here, there would be two objects of the object type <em><strong>UPSLoad</strong></em>.
Within the object, you can find the queried values in the metric Power Load Percentage.
It takes at least 5 minutes for the first values to appear.</p>
<figure><a href="10.png"><picture><source srcset="/vcf9-ops-dashboard/10_hu_6ce32e28e890c317.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/10_hu_6ce32e28e890c317.png"alt="UPS Object"width="1715"
            height="1239"/>
        </picture></a><figcaption><p>UPS Object (click to enlarge)</p></figcaption></figure>
<p>You can now work with these values. However, I only have a percentage value, and I would prefer to see watts, because 300 watts means more to me than 34% load. But how does that work?</p>
<h3 id="super-metrics">Super Metrics</h3>
<p>Super metrics are values that can be calculated using other metrics.
Super metrics can also be built on other super metrics. However, it is important to remember that the source metric must be available before the super metric can be calculated.
It is also important that the super metric is assigned to the correct object type. In my case, the object type is UPSLoad.</p>
<p>A Super Metric must also be assigned to a policy. If this is forgotten, there will be no values.
I use the default policy that is used for every object here. However, you could also write a specific policy.</p>
<figure><a href="11.png"><picture><source srcset="/vcf9-ops-dashboard/11_hu_5358e5f75aa6a916.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/11_hu_5358e5f75aa6a916.png"alt="Super Metrics"width="1719"
            height="935"/>
        </picture></a><figcaption><p>Super Metrics (click to enlarge)</p></figcaption></figure>
<p>The rest is simple math.
Since I know the maximum power of the UPS, i.e., how many watts 100% is, I can easily convert the percentage into watts.</p>
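<p>The arithmetic behind this conversion is a simple proportion. The following lines show it in plain Python rather than in super metric syntax; the maximum power of 900 W is an assumed example value, not the rating of my UPS.</p>

```python
def load_to_watts(load_percent: float, max_watts: float) -> float:
    """Convert a UPS load percentage into watts for a known maximum power."""
    return load_percent * max_watts / 100


# Assumed example: a UPS whose 100% load corresponds to 900 W.
print(load_to_watts(34, 900))  # 306.0
```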
<p>After 1-2 intervals have passed, the Super Metric is displayed in the object and can be used. This can be checked via Inventory.

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">The default interval is 5 minutes, but it can be adjusted to a minimum of 1 minute for objects. The query interval should be adjusted depending on the object and the data.</div>
    </aside></p>
<h3 id="building-the-dashboard">Building the Dashboard</h3>
<p>Now that the basics are done and the first objects are appearing in Operations, I can start building a dashboard.
A new dashboard can be created under <em><strong>Infrastructure Operations -&gt; Dashboards &amp; Reports</strong></em>.
There are quite a few widgets available for displaying various metrics. For watts, a metric chart is useful, as it can show consumption over a certain period of time.
Each widget has a refresh counter (which only determines the interval at which the widget is updated, not the object).
Another important option is <em><strong>Self Provider</strong></em>. When it is set to On, the widget is independent of other widgets:
it does not take input from them, so its input must be configured in the widget itself.</p>
<figure><a href="12.png"><picture><source srcset="/vcf9-ops-dashboard/12_hu_1a86fb706f0baad5.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/12_hu_1a86fb706f0baad5.png"alt="Output Data"width="1053"
            height="717"/>
        </picture></a><figcaption><p>Output Data (click to enlarge)</p></figcaption></figure>
<p>Under Input Data, you can choose whether you want to have a complete object (or several) as input or just metrics.
If I take the object as input, I can also display properties. The actual output of the widget is then configured under Output Data.
Input and output must match. If I select an object as input that is not of the same object type as my output metric, no content will be displayed.</p>
<p>The finished widget now looks something like this.</p>
<figure><a href="13.png"><picture><source srcset="/vcf9-ops-dashboard/13_hu_454d318f5128c523.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/13_hu_454d318f5128c523.png"alt="Widget"width="824"
            height="349"/>
        </picture></a><figcaption><p>Widget (click to enlarge)</p></figcaption></figure>
<p>It wasn&rsquo;t that difficult, and now all the basics are in place to build a nice dashboard.
Of course, there&rsquo;s a lot more you can do, such as symptom definitions that trigger something when certain metrics occur or properties change.
For example, I wrote a symptom definition that reduces the health value of a Veeam object if the backup status is not successful. This also triggers an alarm.</p>
<p>But I think I&rsquo;d rather write about that separately, because it&rsquo;s a topic in its own right. The same goes for the Health metric, which Operations automatically assigns to each object.</p>
<p>My final dashboard now looks like this.</p>
<figure><a href="14.png"><picture><source srcset="/vcf9-ops-dashboard/14_hu_e58556fd28bf023e.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/14_hu_e58556fd28bf023e.png"alt="Final Dashboard"width="1713"
            height="1198"/>
        </picture></a><figcaption><p>Final Dashboard (click to enlarge)</p></figcaption></figure>
<h3 id="side-quest---bearer-token">Side Quest - Bearer Token</h3>
<p>The Veeam API uses bearer tokens with limited runtime, which means that an API request must first be created with a username and password to generate a token before a request to retrieve data can be made. Fortunately, the Management Pack Builder also supports this, but it&rsquo;s all a bit fiddly, so I&rsquo;ll describe it here.
For Veeam to work, authentication in the Data Source Connection must be set to Custom, otherwise we will not be able to create a secure variable. It is also important to enable <em><strong>Use Session Authentication</strong></em>.</p>
<figure><a href="15.png"><picture><source srcset="/vcf9-ops-dashboard/15_hu_6115d8fbfde1e60e.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/15_hu_6115d8fbfde1e60e.png"alt="Veeam"width="1677"
            height="1208"/>
        </picture></a><figcaption><p>Veeam (click to enlarge)</p></figcaption></figure>
<p>From here on, things change slightly, as we now have to perform two requests.
It is also important to note that the username and password must be included in the body for Veeam. A variable is also used for the password here.
Veeam also requires special headers. However, as Veeam uses Swagger, you can extract all of this information from there.</p>
<figure><a href="16.png"><picture><source srcset="/vcf9-ops-dashboard/16_hu_9b8e6f3929f3c328.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/16_hu_9b8e6f3929f3c328.png"alt="Veeam Session"width="1668"
            height="1037"/>
        </picture></a><figcaption><p>Veeam Session (click to enlarge)</p></figcaption></figure>
<p>After successfully creating a session, we can now see the token under Global Settings in Session Fields.
Session Fields are automatic variables that must be used for the Test Connection and also for the Release Session request.
We don&rsquo;t need to change anything in Global Settings, and since I didn&rsquo;t want to censor another screenshot, I decided not to bother.</p>
<figure><a href="17.png"><picture><source srcset="/vcf9-ops-dashboard/17_hu_898ea424b9728dd0.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/17_hu_898ea424b9728dd0.png"alt="Release Session"width="1684"
            height="1210"/>
        </picture></a><figcaption><p>Release Session (click to enlarge)</p></figcaption></figure>
<p>A release session request is nothing more than a logout command that is sent after the data has been retrieved. This should be used to ensure that the session is closed properly.</p>
<p>As a final step, a test connection must be performed so that the DataSource can be saved. I simply queried the session information for this, but ultimately it doesn&rsquo;t matter what is queried here as long as a valid result is returned.</p>
<figure><a href="18.png"><picture><source srcset="/vcf9-ops-dashboard/18_hu_d39388b59b6caccf.png" type="image/png">
          <img
            src="/vcf9-ops-dashboard/18_hu_d39388b59b6caccf.png"alt="Final Test"width="1679"
            height="1210"/>
        </picture></a><figcaption><p>Final Test (click to enlarge)</p></figcaption></figure>
<p>If the data source is set up in this way, the Management Pack is also secure for export, as no credentials are stored. When integrating into Operations, the username and password must then be provided. The session is created, the token is generated, and the session is closed cleanly with each data retrieval cycle.
Everything else is exactly the same as previously described for the Unraid API.</p>
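<p>The cycle of creating a session, retrieving data, and releasing the session can be summarized in a few lines of Python. This is only a sketch of the order of operations under the assumption that every collection cycle logs in and out again. The three callables are hypothetical stand-ins and do not correspond to real Veeam endpoints.</p>

```python
from typing import Callable


def collect_once(
    login: Callable[[], str],
    fetch: Callable[[str], dict],
    logout: Callable[[str], None],
) -> dict:
    """Run one collection cycle: get a token, fetch data, always release the session."""
    token = login()
    try:
        return fetch(token)
    finally:
        # The release request runs even if fetching the data fails.
        logout(token)


# Fake callables so the flow can be observed without any network access.
calls = []


def fake_login() -> str:
    calls.append("login")
    return "token-123"


def fake_fetch(token: str) -> dict:
    calls.append("fetch")
    return {"status": "ok", "token_used": token}


def fake_logout(token: str) -> None:
    calls.append("logout")


data = collect_once(fake_login, fake_fetch, fake_logout)
print(calls)  # ['login', 'fetch', 'logout']
```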
<h2 id="summary">Summary</h2>
<p>VCF Operations is a powerful tool, albeit a little outdated in some areas, particularly when it comes to handling APIs, where you can still see the product&rsquo;s roots and age. Nevertheless, with a little work, Operations can be expanded wonderfully to monitor more than just the VMware environment. If you want, you can build a weather report into the dashboard – why not?</p>
<p>Of course, this article only scratches the surface; we haven&rsquo;t talked about views yet, or how alerting as a whole works, or how to work with colors, and so on.
Considering that it&rsquo;s already almost 3 a.m. and the topic is so extensive, I&rsquo;ll write a more in-depth article if there&rsquo;s interest.
I hope I&rsquo;ve been able to help you make even better use of VCF Operations.</p>
<p>Looking at my final dashboard, I&rsquo;m quite excited about how much you can achieve with it in a short amount of time. I only looked into the topic about a week ago and have been cautiously playing around with it.
My thanks also go to <a href="https://www.vcrocs.info/">Dale Hassinger</a>, who was kind enough to help me. Check out his blog.</p>
<p>And now all that remains for me to say is, happy creating and Merry Christmas!</p>
]]></content>
		</item>
		
		<item>
			<title>VCF9 - Dark Site Edge - Part 2</title>
			<link>https://sdn-warrior.org/posts/vcf9-dark-site-edge-part2/</link>
			<pubDate>Tue, 09 Dec 2025 00:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf9-dark-site-edge-part2/</guid>
			<description><![CDATA[A short blog on how to deploy a Dark Site Edge in VCF9 - Part 2]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>In my last <a href="https://sdn-warrior.org/posts/vcf9-dark-site-edge/">Dark Site</a> article, I showed how to deploy a Dark Site Edge on a host (with a temporary second host).
However, this didn&rsquo;t sit well with me, as I believe it should also be possible with just one host and without any additional tools.
Spoiler alert: it works, and today I&rsquo;m going to write down how to do it.</p>
<p>Of course, we need a few workarounds here, and I was unable to create the setup with external storage such as NFS.
So today I will use vSAN ESA, even though it requires significantly more resources than NFS.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">As always, please do not use in production. In addition, the ESA workaround requires VCF 9.0.1, but you can do without the vSAN ESA Mock VIB.</div>
    </aside>
<p>So buckle up, we&rsquo;re going in.</p>
<h2 id="hardware-modification">Hardware modification</h2>
<p>Since I want to use vSAN for this deployment, because a greenfield deployment with external storage is not possible with only one node,
I first need to expand my hardware. Fortunately,
I purchased my NVMes before the price chaos.
I need a total of three NVMes. I have a 1 TB NVMe (PCIe4) for memory tiering, a 2 TB NVMe (PCIe3) for local storage and the actual ESX. This could be a much smaller disk, but I had it anyway. And finally, another 2 TB NVMe (PCIe3) for vSAN.</p>
<figure><a href="01.png"><picture><source srcset="/vcf9-darksite2/01_hu_2ffa5e1037e1582.png" type="image/png">
          <img
            src="/vcf9-darksite2/01_hu_2ffa5e1037e1582.png" alt="NVMe" width="756"
            height="1008"/>
        </picture></a><figcaption><p>NVMe - MS-01 (click to enlarge)</p></figcaption></figure>
<p>I could have saved myself an NVMe by booting ESX from a USB stick, because you can then share the memory tiering disk with the system disk. However, I only have one USB stick left and need it for firmware updates, and I don&rsquo;t particularly like USB sticks anyway.
Now you might ask why I have two PCIe3 disks. Well, I don&rsquo;t. However, with the default BIOS settings the MS-01 runs only one slot at PCIe4 and the others at PCIe3, as otherwise there is a risk of overheating.
In addition, the MS-01 will serve another purpose later on: sooner or later it will become part of my Nutanix cluster, and I will need the disks for that anyway.</p>
<h2 id="prerequisites">Prerequisites</h2>
<p>As with my first attempt, a deployed fleet is required. The fleet must consist of at least one instance each of VCF Operations, vCenter, NSX Manager, VCF Operations Cloud Proxy, and Fleet Manager.</p>
<h2 id="workarounds">Workarounds</h2>
<p>What would an SDN-warrior deployment be without workarounds? That&rsquo;s right, work and no fun.</p>
<h3 id="workaround-1-ep-cores">Workaround 1 (E&amp;P Cores):</h3>
<p>Since I use the MS-01, the first thing that comes into play is the workaround for the Big.LITTLE architecture. I have described this <a href="https://sdn-warrior.org/posts/nuc/#using-pe-cores">here</a>.</p>
<h3 id="workaround-2-memory-tiering">Workaround 2 (Memory Tiering):</h3>
<p>Memory tiering is also used, as described in the same article as the Big.LITTLE workaround.</p>
<h3 id="workaround-3-single-host-deployment">Workaround 3 (Single host deployment):</h3>
<p>I have described this <a href="https://sdn-warrior.org/posts/vcf9-dark-site-edge/#workaround-4">here</a>.
However, with my 9.0.1 setup, I needed both entries in the feature.properties file.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;feature.vcf.vgl-29121.single.host.domain=true&#34;</span> &gt;&gt; /home/vcf/feature.properties
</span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;feature.vcf.internal.single.host.domain=true&#34;</span> &gt;&gt; /home/vcf/feature.properties
</span></span></code></pre></div><h3 id="workaround-4-bypass-vsan-hcl">Workaround 4 (Bypass vSAN HCL):</h3>
<p>Now we finally come to something new—at least, something new for me.
In VCF 9.0.1, you can finally bypass the HCL check for vSAN ESA. Gone are the days of custom JSON vSAN files and VIB installation on ESX.</p>
<p>To do this, simply log in to the installer via SSH and enter the following in the shell.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;vsan.esa.sddc.managed.disk.claim=true&#34;</span> &gt;&gt; /etc/vmware/vcf/domainmanager/application-prod.properties
</span></span><span class="line"><span class="cl">systemctl restart domainmanager
</span></span></code></pre></div><p>Now that most of the work has been done, we can proceed with the deployment. Just a quick heads-up: there will be a small workaround after the SDDC Manager has been successfully deployed.</p>
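<p>One side note on the property workarounds above: both append with echo, so running them a second time leaves duplicate lines in the file. As a minimal sketch, assuming you want the step to be repeatable, a small helper can add a line only if it is missing. The function name append_once and the scratch file are made up for illustration and are not part of VCF; on the installer you would pass the real properties file instead.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash">#!/usr/bin/env bash
# append_once is a hypothetical helper, not part of VCF.
# It appends a line to a file only if that exact line is not already present.
append_once() {
    local line="$1" file="$2"
    # -q: quiet, -x: match the whole line, -F: treat the line as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Demo on a scratch file; on the installer, pass the real properties file instead.
demo_file="$(mktemp)"
append_once "vsan.esa.sddc.managed.disk.claim=true" "$demo_file"
append_once "vsan.esa.sddc.managed.disk.claim=true" "$demo_file"
# Count how often the property is present after two calls.
grep -c "vsan.esa.sddc.managed.disk.claim=true" "$demo_file"
</code></pre></div>
<p>Called twice with the same line, the helper leaves a single entry in the file, so the final grep prints 1.</p>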
<h2 id="deployment">Deployment</h2>
<p>Now comes the really exciting part. Unfortunately, deployment via the guided GUI is not possible with vSAN on a single host, but that&rsquo;s what JSON is for.
There are a few good VCF JSON generators available online, such as this <a href="https://feardamhan.com/2025/08/14/introducing-the-vcf-jsongenerator-powershell-module-for-vmware-cloud-foundation/">one</a>, or you can simply use my JSON and modify it.
As with the first Dark Site setup, I am using my existing Fleet, which means I don&rsquo;t have to deploy VCF Operations, Automation, Operations for Logs, or even the Fleet Manager.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">FQDNs, IP addresses, and passwords must be adjusted. To integrate existing resources, you need their SSL thumbprints,
namely those of VCF Operations, VCF Fleet Manager, and the ESX host. Setting &#34;skipEsxThumbprintValidation&#34;: true skips the thumbprint verification for the ESX host.</div>
    </aside>
<p>To obtain the SSL thumbprints, simply log in to the ESX server and use the following command.
Adjust the FQDN in the query for each thumbprint.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nb">echo</span> <span class="p">|</span> openssl s_client -connect vcf09-e01-esx01.lab.vcf:443 2&gt;/dev/null <span class="p">|</span> openssl x509 -noout -fingerprint -sha256 <span class="p">|</span> cut -d<span class="o">=</span> -f2
</span></span></code></pre></div><p>Here is the JSON file I used for VCF 9.0.1. Maybe I should create a GitHub repository for things like this.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="o">{</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;sddcId&#34;</span>: <span class="s2">&#34;dark-site03&#34;</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;vcfInstanceName&#34;</span>: <span class="s2">&#34;dark-site03&#34;</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;workflowType&#34;</span>: <span class="s2">&#34;VCF&#34;</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;version&#34;</span>: <span class="s2">&#34;9.0.1.0&#34;</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;ceipEnabled&#34;</span>: false,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;skipEsxThumbprintValidation&#34;</span>: true,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;dnsSpec&#34;</span>: <span class="o">{</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;nameservers&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;192.168.11.2&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34; 192.168.100.254&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="o">]</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;subdomain&#34;</span>: <span class="s2">&#34;lab.vcf&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="o">}</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;ntpServers&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;192.168.12.1&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="o">]</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;vcenterSpec&#34;</span>: <span class="o">{</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;vcenterHostname&#34;</span>: <span class="s2">&#34;vcf09-e03-vcsa.lab.vcf&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;rootVcenterPassword&#34;</span>: <span class="s2">&#34;xxx&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;vmSize&#34;</span>: <span class="s2">&#34;small&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;storageSize&#34;</span>: <span class="s2">&#34;&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;dminUserSsoUserName&#34;</span>: <span class="s2">&#34;administrator@vsphere.local&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;adminUserSsoPassword&#34;</span>: <span class="s2">&#34;xxx&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;ssoDomain&#34;</span>: <span class="s2">&#34;vsphere.local&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;useExistingDeployment&#34;</span>: <span class="nb">false</span>
</span></span><span class="line"><span class="cl">    <span class="o">}</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;clusterSpec&#34;</span>: <span class="o">{</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;clusterName&#34;</span>: <span class="s2">&#34;dark-site03&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;datacenterName&#34;</span>: <span class="s2">&#34;e03&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="o">}</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;datastoreSpec&#34;</span>: 
</span></span><span class="line"><span class="cl">    <span class="o">{</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;vsanSpec&#34;</span>: <span class="o">{</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;failuresToTolerate&#34;</span>: 0,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;vsanDedup&#34;</span>: false,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;esaConfig&#34;</span>: <span class="o">{</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;enabled&#34;</span>: <span class="nb">true</span>
</span></span><span class="line"><span class="cl">        <span class="o">}</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;datastoreName&#34;</span>: <span class="s2">&#34;vsanDatastore&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="o">}</span>
</span></span><span class="line"><span class="cl">    <span class="o">}</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;nsxtSpec&#34;</span>: <span class="o">{</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;nsxtManagerSize&#34;</span>: <span class="s2">&#34;medium&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;nsxtManagers&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">            <span class="o">{</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;hostname&#34;</span>: <span class="s2">&#34;vcf09-e03-nsxa.lab.vcf&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="o">}</span>
</span></span><span class="line"><span class="cl">        <span class="o">]</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;vipFqdn&#34;</span>: <span class="s2">&#34;vcf09-e03-nsx.lab.vcf&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;useExistingDeployment&#34;</span>: false,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;nsxtAdminPassword&#34;</span>: <span class="s2">&#34;xxx&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;nsxtAuditPassword&#34;</span>: <span class="s2">&#34;xxx&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;rootNsxtManagerPassword&#34;</span>: <span class="s2">&#34;xxx&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;skipNsxOverlayOverManagementNetwork&#34;</span>: true,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;ipAddressPoolSpec&#34;</span>: <span class="o">{</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;name&#34;</span>: <span class="s2">&#34;host-overlay&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;description&#34;</span>: <span class="s2">&#34;&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;subnets&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">                <span class="o">{</span>
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;cidr&#34;</span>: <span class="s2">&#34;10.28.24.0/24&#34;</span>,
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;gateway&#34;</span>: <span class="s2">&#34;10.28.24.1&#34;</span>,
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;ipAddressPoolRanges&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">                        <span class="o">{</span>
</span></span><span class="line"><span class="cl">                            <span class="s2">&#34;start&#34;</span>: <span class="s2">&#34;10.28.24.21&#34;</span>,
</span></span><span class="line"><span class="cl">                            <span class="s2">&#34;end&#34;</span>: <span class="s2">&#34;10.28.24.30&#34;</span>
</span></span><span class="line"><span class="cl">                        <span class="o">}</span>
</span></span><span class="line"><span class="cl">                    <span class="o">]</span>
</span></span><span class="line"><span class="cl">                <span class="o">}</span>
</span></span><span class="line"><span class="cl">            <span class="o">]</span>
</span></span><span class="line"><span class="cl">        <span class="o">}</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;transportVlanId&#34;</span>: <span class="s2">&#34;2024&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="o">}</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;vcfOperationsSpec&#34;</span>: <span class="o">{</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;nodes&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">            <span class="o">{</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;hostname&#34;</span>: <span class="s2">&#34;vcf09-ops.lab.vcf&#34;</span>,
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;rootUserPassword&#34;</span>: <span class="s2">&#34;xxx&#34;</span>,
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;type&#34;</span>: <span class="s2">&#34;master&#34;</span>,
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;sslThumbprint&#34;</span>: <span class="s2">&#34;XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="o">}</span>
</span></span><span class="line"><span class="cl">        <span class="o">]</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;adminUserPassword&#34;</span>: <span class="s2">&#34;xxx&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;applianceSize&#34;</span>: <span class="s2">&#34;xsmall&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;useExistingDeployment&#34;</span>: true,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;loadBalancerFqdn&#34;</span>: null
</span></span><span class="line"><span class="cl">    <span class="o">}</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;vcfOperationsFleetManagementSpec&#34;</span>: <span class="o">{</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;hostname&#34;</span>: <span class="s2">&#34;fleet.lab.vcf&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;rootUserPassword&#34;</span>: <span class="s2">&#34;xxx&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;adminUserPassword&#34;</span>: <span class="s2">&#34;xxx&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;useExistingDeployment&#34;</span>: true,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;sslThumbprint&#34;</span>: <span class="s2">&#34;XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="o">}</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;vcfOperationsCollectorSpec&#34;</span>: <span class="o">{</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;hostname&#34;</span>: <span class="s2">&#34;vcf09-e03-ops.lab.vcf&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;applianceSize&#34;</span>: <span class="s2">&#34;small&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;rootUserPassword&#34;</span>: <span class="s2">&#34;xxx&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;useExistingDeployment&#34;</span>: <span class="nb">false</span>
</span></span><span class="line"><span class="cl">    <span class="o">}</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;hostSpecs&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">        <span class="o">{</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;hostname&#34;</span>: <span class="s2">&#34;vcf09-e03-esx01.lab.vcf&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;credentials&#34;</span>: <span class="o">{</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;username&#34;</span>: <span class="s2">&#34;root&#34;</span>,
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;password&#34;</span>: <span class="s2">&#34;xxx&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="o">}</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;sslThumbprint&#34;</span>: <span class="s2">&#34;XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="o">}</span>
</span></span><span class="line"><span class="cl">    <span class="o">]</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;networkSpecs&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">        <span class="o">{</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;networkType&#34;</span>: <span class="s2">&#34;MANAGEMENT&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;subnet&#34;</span>: <span class="s2">&#34;10.28.22.0/24&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;gateway&#34;</span>: <span class="s2">&#34;10.28.22.1&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;subnetMask&#34;</span>: null,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;includeIpAddress&#34;</span>: null,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;includeIpAddressRanges&#34;</span>: null,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;vlanId&#34;</span>: <span class="s2">&#34;2022&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;mtu&#34;</span>: <span class="s2">&#34;1500&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;teamingPolicy&#34;</span>: <span class="s2">&#34;loadbalance_loadbased&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;activeUplinks&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;uplink1&#34;</span>,
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;uplink2&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="o">]</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;standbyUplinks&#34;</span>: null,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;portGroupKey&#34;</span>: <span class="s2">&#34;e03-cl01-vds01-pg-esx-mgmt&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="o">}</span>,
</span></span><span class="line"><span class="cl">        <span class="o">{</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;networkType&#34;</span>: <span class="s2">&#34;VM_MANAGEMENT&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;subnet&#34;</span>: <span class="s2">&#34;10.28.13.0/24&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;gateway&#34;</span>: <span class="s2">&#34;10.28.13.1&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;subnetMask&#34;</span>: null,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;includeIpAddress&#34;</span>: null,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;includeIpAddressRanges&#34;</span>: null,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;vlanId&#34;</span>: <span class="s2">&#34;2013&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;mtu&#34;</span>: <span class="s2">&#34;1500&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;teamingPolicy&#34;</span>: <span class="s2">&#34;loadbalance_loadbased&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;activeUplinks&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;uplink1&#34;</span>,
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;uplink2&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="o">]</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;standbyUplinks&#34;</span>: null,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;portGroupKey&#34;</span>: <span class="s2">&#34;e03-cl01-vds01-pg-vm-mgmt&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="o">}</span>,
</span></span><span class="line"><span class="cl">        <span class="o">{</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;networkType&#34;</span>: <span class="s2">&#34;VMOTION&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;subnet&#34;</span>: <span class="s2">&#34;10.28.23.0/24&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;gateway&#34;</span>: <span class="s2">&#34;10.28.23.1&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;subnetMask&#34;</span>: null,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;includeIpAddress&#34;</span>: null,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;includeIpAddressRanges&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">                <span class="o">{</span>
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;startIpAddress&#34;</span>: <span class="s2">&#34;10.28.23.20&#34;</span>,
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;endIpAddress&#34;</span>: <span class="s2">&#34;10.28.23.29&#34;</span>
</span></span><span class="line"><span class="cl">                <span class="o">}</span>
</span></span><span class="line"><span class="cl">            <span class="o">]</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;vlanId&#34;</span>: <span class="s2">&#34;2023&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;mtu&#34;</span>: <span class="s2">&#34;1500&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;teamingPolicy&#34;</span>: <span class="s2">&#34;loadbalance_loadbased&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;activeUplinks&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;uplink1&#34;</span>,
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;uplink2&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="o">]</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;standbyUplinks&#34;</span>: null,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;portGroupKey&#34;</span>: <span class="s2">&#34;e03-cl01-vds01-pg-vmotion&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="o">}</span>,
</span></span><span class="line"><span class="cl">        <span class="o">{</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;networkType&#34;</span>: <span class="s2">&#34;VSAN&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;subnet&#34;</span>: <span class="s2">&#34;10.28.16.0/24&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;gateway&#34;</span>: <span class="s2">&#34;10.28.16.1&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;subnetMask&#34;</span>: null,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;includeIpAddress&#34;</span>: null,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;includeIpAddressRanges&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">                <span class="o">{</span>
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;startIpAddress&#34;</span>: <span class="s2">&#34;10.28.16.40&#34;</span>,
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;endIpAddress&#34;</span>: <span class="s2">&#34;10.28.16.50&#34;</span>
</span></span><span class="line"><span class="cl">                <span class="o">}</span>
</span></span><span class="line"><span class="cl">            <span class="o">]</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;vlanId&#34;</span>: <span class="s2">&#34;2016&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;mtu&#34;</span>: <span class="s2">&#34;9000&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;teamingPolicy&#34;</span>: <span class="s2">&#34;loadbalance_loadbased&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;activeUplinks&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;uplink1&#34;</span>,
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;uplink2&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="o">]</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;standbyUplinks&#34;</span>: null,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;portGroupKey&#34;</span>: <span class="s2">&#34;e03-cl01-vds01-pg-vsan&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="o">}</span>
</span></span><span class="line"><span class="cl">    <span class="o">]</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;dvsSpecs&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">        <span class="o">{</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;dvsName&#34;</span>: <span class="s2">&#34;e03-cl02-vds01&#34;</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;networks&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;MANAGEMENT&#34;</span>,
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;VM_MANAGEMENT&#34;</span>,
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;VMOTION&#34;</span>,
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;VSAN&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="o">]</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;mtu&#34;</span>: 9000,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;nsxtSwitchConfig&#34;</span>: <span class="o">{</span>
</span></span><span class="line"><span class="cl">                <span class="s2">&#34;transportZones&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">                    <span class="o">{</span>
</span></span><span class="line"><span class="cl">                        <span class="s2">&#34;transportType&#34;</span>: <span class="s2">&#34;OVERLAY&#34;</span>,
</span></span><span class="line"><span class="cl">                        <span class="s2">&#34;name&#34;</span>: <span class="s2">&#34;VCF-Created-Overlay-Zone&#34;</span>
</span></span><span class="line"><span class="cl">                    <span class="o">}</span>
</span></span><span class="line"><span class="cl">                <span class="o">]</span>
</span></span><span class="line"><span class="cl">            <span class="o">}</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;vmnicsToUplinks&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">                <span class="o">{</span>
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;id&#34;</span>: <span class="s2">&#34;vmnic0&#34;</span>,
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;uplink&#34;</span>: <span class="s2">&#34;uplink1&#34;</span>
</span></span><span class="line"><span class="cl">                <span class="o">}</span>,
</span></span><span class="line"><span class="cl">                <span class="o">{</span>
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;id&#34;</span>: <span class="s2">&#34;vmnic1&#34;</span>,
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;uplink&#34;</span>: <span class="s2">&#34;uplink2&#34;</span>
</span></span><span class="line"><span class="cl">                <span class="o">}</span>
</span></span><span class="line"><span class="cl">            <span class="o">]</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;nsxTeamings&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">                <span class="o">{</span>
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;policy&#34;</span>: <span class="s2">&#34;LOADBALANCE_SRCID&#34;</span>,
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;activeUplinks&#34;</span>: <span class="o">[</span>
</span></span><span class="line"><span class="cl">                        <span class="s2">&#34;uplink1&#34;</span>,
</span></span><span class="line"><span class="cl">                        <span class="s2">&#34;uplink2&#34;</span>
</span></span><span class="line"><span class="cl">                    <span class="o">]</span>,
</span></span><span class="line"><span class="cl">                    <span class="s2">&#34;standByUplinks&#34;</span>: <span class="o">[]</span>
</span></span><span class="line"><span class="cl">                <span class="o">}</span>
</span></span><span class="line"><span class="cl">            <span class="o">]</span>,
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;lagSpecs&#34;</span>: null
</span></span><span class="line"><span class="cl">        <span class="o">}</span>
</span></span><span class="line"><span class="cl">    <span class="o">]</span>,
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;sddcManagerSpec&#34;</span>: <span class="o">{</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;hostname&#34;</span>: <span class="s2">&#34;vcf09-e03-sddc.lab.vcf&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;useExistingDeployment&#34;</span>: false,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;rootPassword&#34;</span>: <span class="s2">&#34;xxx&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;sshPassword&#34;</span>: <span class="s2">&#34;xxx&#34;</span>,
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;localUserPassword&#34;</span>: <span class="s2">&#34;xxx&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="o">}</span>
</span></span><span class="line"><span class="cl"><span class="o">}</span>
</span></span></code></pre></div><p>Once the typing is done, you can simply upload the JSON file via the installer&rsquo;s GUI and start the validation.
In the best case, it should now look like this.</p>
<figure><a href="02.png"><picture><source srcset="/vcf9-darksite2/02_hu_b8079c13fb57208e.png" type="image/png">
          <img
            src="/vcf9-darksite2/02_hu_b8079c13fb57208e.png"alt="VCF Installer"width="1719"
            height="1371"/>
        </picture></a><figcaption><p>VCF Installer (click to enlarge)</p></figcaption></figure>
<p>You can ignore the DNS warning. I simply made a fat-finger mistake when deploying my VCF Installer and mistyped one of the two DNS servers.
The ESX Host vSAN warning is expected, as I am using 100% unsupported consumer hardware—server vendors hate this trick. The same applies to my vSAN Network Gateway.
Since this network does not need to route, it does not have a gateway, so this message can also be ignored.
So fill up your coffee cup and press deploy.</p>
<p>As I mentioned earlier, the installation will fail and we will have to perform a final workaround. The installer will abort at the “Configure the vSphere cluster” step because a RAID 1 vSAN policy is set by default, which a single host obviously cannot fulfill.</p>
<h3 id="final-workaround-5-vsan-policy">(Final) Workaround 5 (vSAN Policy):</h3>
<p>The solution is super simple. Go to the Policies and Profiles settings via the burger menu in the vCenter and find the vSAN policy that the installer has set as default.
This is usually a policy with the name of the VCF instance and Optimal Datastore Default Policy - RAID1.
Lovely name. Edit this policy and set Failures to tolerate to “No data redundancy.”
I don&rsquo;t need to mention that this setting should never be used in production 99.9% of the time&hellip;</p>
<figure><a href="03.png"><picture><source srcset="/vcf9-darksite2/03_hu_d3db1a6450c81862.png" type="image/png">
          <img
            src="/vcf9-darksite2/03_hu_d3db1a6450c81862.png"alt="vSAN Policy"width="1177"
            height="956"/>
        </picture></a><figcaption><p>Dangerzone - vSAN Policy (click to enlarge)</p></figcaption></figure>
<p>After that, you can simply restart the deployment in the VCF installer using retry. The deployment should now run without any problems. For me, the whole process took about 2 hours.</p>
<figure><a href="04.png"><picture><source srcset="/vcf9-darksite2/04_hu_beee2fa0a4114af8.png" type="image/png">
          <img
            src="/vcf9-darksite2/04_hu_beee2fa0a4114af8.png"alt="VCF Installer"width="1707"
            height="1099"/>
        </picture></a><figcaption><p>VCF Installer (click to enlarge)</p></figcaption></figure>
<p>Nice, another successful deployment completed after a few workarounds, and this time really only on one host. I then deployed two more edge VMs and built my BGP peering. The Kubernetes cluster is still missing, but that&rsquo;s a story for another blog. The finished VCF instance now looks like this.</p>
<figure><a href="05.png"><picture><source srcset="/vcf9-darksite2/05_hu_3ff971829c12e3c7.png" type="image/png">
          <img
            src="/vcf9-darksite2/05_hu_3ff971829c12e3c7.png"alt="VCF Instance"width="2102"
            height="1243"/>
        </picture></a><figcaption><p>VCF Instance (click to enlarge)</p></figcaption></figure>
<h2 id="summary">Summary</h2>
<p>Significantly fewer workarounds but also more RAM required than with NFS. That&rsquo;s my feedback in a nutshell.
Perhaps I should say a few words about performance, as I am using NVMes that are unfamiliar to me in this setup.
At first glance, the Lexar EQ790s perform quite well.
They have similar read values to a Samsung 990 Pro and slightly weaker write values than the aforementioned drive, but the bottom line is that I couldn&rsquo;t get a Samsung 990 Pro at an affordable price.</p>
<figure><a href="06.png"><picture><source srcset="/vcf9-darksite2/06_hu_91c7462abe72b798.png" type="image/png">
          <img
            src="/vcf9-darksite2/06_hu_91c7462abe72b798.png"alt="Crystal Disk Mark"width="1193"
            height="792"/>
        </picture></a><figcaption><p>vSAN Crystal Disk Mark Performance (click to enlarge)</p></figcaption></figure>
<p>I also did a quick test with a Windows Server VM and Crystal Disk Mark. As always, this isn&rsquo;t a 100% accurate test, but it just goes to show that this setup is perfectly usable for a lab environment to run a few test VMs and Kubernetes workloads without me always having to turn on multiple servers.
So I achieved my goal 100%.
I hope I was able to give you some helpful ideas for your setups.</p>
]]></content>
		</item>
		
		<item>
			<title>VCF9 BUG: After deleting a VCF instance on VCF Operations, no VCF instance can be imported anymore</title>
			<link>https://sdn-warrior.org/posts/bug-vcf-onboarding/</link>
			<pubDate>Wed, 03 Dec 2025 20:54:20 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/bug-vcf-onboarding/</guid>
			<description><![CDATA[A short blog on how to deploy a Dark Site Edge in VCF9 - Part 2]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>While preparing for an article that has not yet been published, I ran into an interesting VCF bug.
For the article, I had to delete a VCF instance from my operations. For background information, I currently have three VCF instances onboarded in my fleet.
Since I had to reinstall one of them, I removed it according to the official <a href="https://techdocs.broadcom.com/us/en/vmware-cis/vcf/vcf-9-0-and-later/9-0/fleet-management/adding-an-existing-vcf-instance-to-a-vcf-fleet.html">instructions</a>, and that&rsquo;s when the problems started.</p>
<h2 id="disclaimer">Disclaimer</h2>

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>Warning</b>
        </div>
        <div class="admonition-content">I have reported the error to Broadcom, but as of now (December 3, 2025), the error and the solution have not been confirmed or acknowledged. I can reproduce and resolve the error multiple times, but I must reiterate that you do so <strong>at your own risk.</strong></div>
    </aside>
<h2 id="error-pattern-and-impact">Error pattern and impact</h2>
<p>At first glance, the VCF instance can be deleted normally and everything appears to be functioning correctly. The error only becomes apparent when the vCLM (aka Fleet Manager) is required and the old VCF instance has been deleted. Typical errors:</p>
<ul>
<li>Inventory Sync fails with the message that SDDC from instance X is not reachable</li>
<li>Deployment of a new VCF instance to the existing instance of VCF Operations fails</li>
</ul>
<figure><a href="01.png"><picture><source srcset="/bug-sddc/01_hu_8fecfa6fce911ae4.png" type="image/png">
          <img
            src="/bug-sddc/01_hu_8fecfa6fce911ae4.png"alt="Error VCF deployment"width="1720"
            height="896"/>
        </picture></a><figcaption><p>Error VCF Installer (click to enlarge)</p></figcaption></figure>
<p>The error shown in the screenshot occurs during the “Join the existing fleet management appliance” step in the VCF deployment.
When I take a closer look at the error message, I notice that an attempt is being made to log in to an SDDC Manager:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="s2">&#34;exceptionMessage&#34;</span>:<span class="s2">&#34;Could not get API token from SDDC Manager vcf09-e02-sddc.lab.vcf.
</span></span></span></code></pre></div><p>Coincidentally, the SDDC Manager belongs to the very VCF instance that was previously removed.
A search in Operations revealed that there were still configuration remnants in Operations.
Even after deleting these remnants, the error persisted—you don&rsquo;t want to know how many times I deployed and deleted a VCF instance in the last two days.</p>
<h2 id="troubleshooting">Troubleshooting</h2>
<p>After searching forever, I found an entry in the Fleet Manager database that explains the behavior, and that&rsquo;s also the point where we end up in the danger zone. So buckle up, take a snapshot of the Fleet Manager, and jump into the SSH session.</p>
<figure><a href="02.png"><picture><source srcset="/bug-sddc/02_hu_1d6ab8a1e62e3534.png" type="image/png">
          <img
            src="/bug-sddc/02_hu_1d6ab8a1e62e3534.png"alt="vLCM DB"width="2474"
            height="529"/>
        </picture></a><figcaption><p>SDDC in the vCLM DB (click to enlarge)</p></figcaption></figure>
<p>I must apologize for the screenshots, but I&rsquo;m glad I took them at all. What you can see in the screenshot is that the old SDDC Manager is listed in the <strong>vm_lcops_sddc_manager</strong> table.
In total, there were four SDDC Managers listed here—including the certificate in the database.
Together with the other issues I found in Operations, this was the reason why the VCF installer was blocked and my inventory sync wasn&rsquo;t working. When deleting a VCF instance, the SDDC does not appear to be deleted from the fleet database.
I had exactly the same problem when switching from the public beta 9 to the final version of VCF 9. However, I attributed this to the beta installation and therefore did not spend much time troubleshooting.</p>
<h2 id="problem-solving">Problem solving</h2>

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>Warning</b>
        </div>
        <div class="admonition-content">Take a snapshot of your fleet manager</div>
    </aside>
<p>Since there is no official solution yet, here is my unofficial solution to the problem.
I have tried this several times and was able to verify that it works. Would I do this in a production environment? Definitely not. You have been warned.</p>
<p>First, the integration must be deleted, just as described in the official guide.</p>
<ul>
<li>Administration -&gt; Integrations - Delete Accounts</li>
</ul>
<p>After that, the cloud proxy must be deleted, as this was overlooked.</p>
<ul>
<li>Administration -&gt; Cloud Proxies - Delete Cloud Proxy</li>
</ul>
<p>The next step is to delete the deployment target.</p>
<ul>
<li>Fleet Management -&gt; Lifecycle -&gt; Settings -&gt; Deployment Target - Delete vCenter</li>
<li>If the vCenter cannot be deleted, trigger an inventory sync (double-check that the integration is still deleted).</li>
</ul>
<p>At that point, I wiped my removed VCF instance.</p>
<p>To delete the SDDC entry in the Fleet Manager DB, you must connect to the appliance via SSH as the root user.</p>
<p>Connect to the DB:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">su - postgres
</span></span><span class="line"><span class="cl">/opt/vmware/vpostgres/current/bin/psql -U postgres vrlcm
</span></span></code></pre></div><p>Find your SDDC Manager entries:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">SELECT sddcmanagerhostname,vmid FROM vm_lcops_sddc_manager<span class="p">;</span>
</span></span></code></pre></div><p>You will receive a list with the host names and the VMID.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nv">vrlcm</span><span class="o">=</span><span class="c1"># SELECT sddcmanagerhostname,vmid FROM vm_lcops_sddc_manager;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  sddcmanagerhostname   <span class="p">|</span>                 vmid                 
</span></span><span class="line"><span class="cl">------------------------+--------------------------------------
</span></span><span class="line"><span class="cl"> vcf09-e03-sddc.lab.vcf <span class="p">|</span> 0838e38b-ba5c-40e3-b50e-2c86f0b2cf3c
</span></span><span class="line"><span class="cl"> vcf09-e01-sddc.lab.vcf <span class="p">|</span> 06c98f38-d855-4b4a-b62c-f81fae9158e0
</span></span><span class="line"><span class="cl"> vcf09-sddc.lab.vcf     <span class="p">|</span> 52b7a8cc-7d41-4c30-ad5c-960eb6db515d
</span></span><span class="line"><span class="cl"><span class="o">(</span><span class="m">3</span> rows<span class="o">)</span>
</span></span></code></pre></div><p>Deleting the SDDC entry:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">DELETE FROM vm_lcops_sddc_manager WHERE <span class="nv">vmid</span> <span class="o">=</span> <span class="s1">&#39;xxxx&#39;</span><span class="p">;</span>
</span></span></code></pre></div><p>Reboot the fleet manager.</p>
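<p>Since a mistyped <code>WHERE</code> clause in this table is the last thing you want, here is a minimal sketch of a small guard around the DELETE statement shown above. It assumes the vmid is always a UUID, as in the SELECT output; <code>build_delete_sql</code> is a hypothetical helper name of mine and not part of any VMware tooling.</p>

```shell
#!/bin/sh
# Sketch: print the DELETE statement only if the vmid looks like a UUID.
# build_delete_sql is a hypothetical helper, not part of any VMware tooling.
build_delete_sql() {
    vmid="$1"
    # A UUID is exactly 36 characters long; the length check also rules out
    # multi-line input before the pattern test below.
    if [ "${#vmid}" -ne 36 ]; then
        echo "refusing: '$vmid' is not a UUID" >&2
        return 1
    fi
    # Accept only the 8-4-4-4-12 lowercase hex format shown in the SELECT output.
    if printf '%s\n' "$vmid" | grep -Eq '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'; then
        printf "DELETE FROM vm_lcops_sddc_manager WHERE vmid = '%s';\n" "$vmid"
    else
        echo "refusing: '$vmid' is not a UUID" >&2
        return 1
    fi
}
```

<p>Calling <code>build_delete_sql</code> with the vmid of the removed SDDC Manager only prints the statement, which can then be pasted into the psql session. It does not touch the database itself, so the snapshot is still mandatory.</p>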
<h2 id="summary">Summary</h2>
<p>Congratulations, we have hopefully survived the open-heart surgery. After that, I was able to simply press resume in the VCF installer and the VCF instance installation ran smoothly.</p>
<figure><a href="03.png"><picture><source srcset="/bug-sddc/03_hu_502148647c67f613.png" type="image/png">
          <img
            src="/bug-sddc/03_hu_502148647c67f613.png"alt="Installation finished"width="1707"
            height="1099"/>
        </picture></a><figcaption><p>Installation complete (click to enlarge)</p></figcaption></figure>
<p>Of course, I can&rsquo;t say for sure whether all hidden entries from the old SDDC have been removed from the environment. The only thing to do here is wait until the bug is officially fixed. As I already mentioned, I was able to reproduce the behavior several times. Now that this problem has been solved, I can get back to working on the article I had originally planned.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Broadcom published an official KB article on this topic on January 7, 2026: <a href="https://knowledge.broadcom.com/external/article/424495">https://knowledge.broadcom.com/external/article/424495</a></div>
    </aside>
]]></content>
		</item>
		
		<item>
			<title>VCF9 - Dark Site Edge</title>
			<link>https://sdn-warrior.org/posts/vcf9-dark-site-edge/</link>
			<pubDate>Sat, 22 Nov 2025 00:30:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf9-dark-site-edge/</guid>
			<description><![CDATA[A short blog on how to deploy a Dark Site Edge in VCF9]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>I haven&rsquo;t written anything in a while, and I need to change that here with another niche topic.
I want to take a closer look at VCF9 Edge. This does not refer to the Edge cluster in NSX; no, VCF Edge is a special form of a VCF instance.
VCF offers nine complete designs for this form, and I would like to take a closer look at one special version: the so-called Dark Site Edge.</p>
<h2 id="welcome-to-the-dark-sidte">Welcome to the dark si(d)te</h2>
<p>The &ldquo;Dark Site&rdquo; has its own independent VCF instance, separate from the primary VCF Fleet.
A dark site has limited or no external network connectivity. An independent VCF instance ensures full functionality and management within the isolated environment without relying on external services.
Okay, cool, so what do I do with it now? Well, my plan is to deploy a VCF9 instance on a single NUC 14 with an Ultra 7 CPU. I want to run a VKS Supervisor and deploy VM workloads on the dark site using automation.
The cool thing is that Dark Site Edge can also be used without the fleet. So I have a VCF instance with VKS, NSX, and vCenter in a case measuring approximately 0.7 liters, or for our friends who don&rsquo;t use the metric system, with the volume of ¾ of a standard coffee cup.
(The volume specifications come from an AI; unfortunately, I can&rsquo;t work with the non-metric system.)</p>
<h2 id="prerequisites">Prerequisites</h2>
<p>In order to set everything up, I first need my NUC with memory tiering. This will initially be loaded with ESX 9.0.1.
If you want to know which workarounds are needed for an Intel NUC, you can check <a href="https://sdn-warrior.org/posts/nuc">here</a>.</p>
<p>Furthermore, an existing fleet is required.
Even if the VCF instance works without a permanent connection to the fleet, the fleet is still required for licensing, and therefore the instance must contact the fleet at least once every 180 days.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">There is also another variant without connection, but then you have a local fleet on the edge, and so it is more of a licensing distinction from the normal VCF than a functional distinction.</div>
    </aside>
<p>In addition, an offline repository is required, as the software must be obtained from somewhere. In our scenario, we assume that I do not have a permanent or stable connection to the fleet.</p>
<h2 id="design">Design</h2>
<p>In terms of design, the setup is intended to complement my VCF 9 setup as described <a href="https://sdn-warrior.org/posts/vcf9-ms-a2-special/">here</a>.</p>
<figure><a href="01.png"><picture><source srcset="/vcf9-darksite/01_hu_aef9259605a0916d.png" type="image/png">
          <img
            src="/vcf9-darksite/01_hu_aef9259605a0916d.png"alt="Dark Site Design"width="1796"
            height="1423"/>
        </picture></a><figcaption><p>Dark Site Design (click to enlarge)</p></figcaption></figure>
<ul>
<li>
<p>Local Management Domain:</p>
<p>VCF Instance 2 includes its own Management Domain components (VCF Operations Collector, SDDC Manager, vCenter and NSX).
Essential for the local management of the compute, network, and storage resources within the dark site&rsquo;s VCF instance. It provides the necessary control plane for the dark site&rsquo;s infrastructure.</p>
</li>
<li>
<p>Local Compute Cluster:</p>
<p>The VCF instance is set up as a single node cluster and thus shares resources for management and workload.</p>
</li>
<li>
<p>VCF Operations Collector:</p>
<p>Data is collected locally during offline mode and synchronized with VCF Operations when connectivity to the central site is established. Data here might also be periodically exported or analyzed locally.</p>
</li>
</ul>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">We wouldn&rsquo;t be here on SDN-Warrior.org if workarounds weren&rsquo;t needed again, because the original design isn&rsquo;t actually intended to run on a single host.
But more on that later.</div>
    </aside>
<h2 id="usecase">Use cases</h2>
<ul>
<li>Strict Isolation and Security Requirements</li>
<li>Unreliable or Intermittent Network Connectivity</li>
<li>Autonomous Operations Requirement - Situations where the edge location needs to function independently for extended periods without external dependencies, including management, monitoring, and software updates.</li>
</ul>
<p>At least, those are Broadcom&rsquo;s official best-fit scenarios. Personally, I have another use case.
Since the dark site comes with its own NSX instance, I can also test things in NSX without turning on my MS-A2, saving around 150 watts.</p>
<h2 id="okayyyyyyyy-lets-go">Okayyyyyyyy, let&rsquo;s go!</h2>
<p>The whole thing can&rsquo;t be that difficult, I thought, and naively set to work, and what can I say—I had no idea what wouldn&rsquo;t work and where I would have to improvise.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">A clear warning to everyone: this is not a production setup, and it&rsquo;s also a questionable setup. I&rsquo;m doing this for fun and because you learn the most when you think outside the box.</div>
    </aside>
<p><a href="https://williamlam.com/2025/06/deploying-vcf-9-0-on-a-single-esxi-host.html">William Lam</a> has an article online where he writes about how to deploy VCF with a single host. I tried for several hours and failed repeatedly.
Maybe I&rsquo;m too stupid, maybe I overlooked something, but in the end it doesn&rsquo;t matter, because we&rsquo;re in unsupported territory here.
Slight foreshadowing—this won&rsquo;t be the last time a plan doesn&rsquo;t work out. At this point, I&rsquo;ll count the workarounds so I can get the setup to work.</p>
<h2 id="workaround-1-and-2">Workaround 1 and 2:</h2>
<p>Activating E/P cores and memory tiering. These workarounds are well described and I use them all the time.
You can find a description of the workaround in the introduction to the article.</p>
<h2 id="workaround-3">Workaround 3:</h2>
<p>Since my NUC has only one 2.5 Gb/s network connection and the VCF installer actually requires 2x 10 Gb/s networking, workaround 3 is used.
The VCF Installer checks in the pre-checks whether suitable network adapters are configured; fortunately, this can be disabled.</p>
<p>To do this, log in to the VCF Installer via SSH, obtain root privileges with su, and then simply add the following to /etc/vmware/vcf/domainmanager/application.properties</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;enable.speed.of.physical.nics.validation=false&#34;</span> &gt;&gt; /etc/vmware/vcf/domainmanager/application.properties
</span></span></code></pre></div><p>Next, restart the services and that&rsquo;s it.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s1">&#39;y&#39;</span> <span class="p">|</span> /opt/vmware/vcf/operationsmanager/scripts/cli/sddcmanager_restart_services.sh
</span></span></code></pre></div><h2 id="workaround-35">Workaround 3.5</h2>
<p>This is not a software workaround but a hardware workaround.
Since I use NFS storage, the VCF installer attempts to mount the NFS share during the precheck.
Even if you have configured the VDS in the JSON to use a single NIC, this precheck is always performed with the second NIC.
If the server does not have a second NIC, this will always fail. I used an old 1 Gb/s USB network card for this. It has an Intel chip and was recognized without any problems.
However, the adapter does not support jumbo frames, and as soon as more than one VLAN is configured, the adapter no longer works. But that&rsquo;s enough for the precheck.</p>
<figure><a href="04.png"><picture><source srcset="/vcf9-darksite/04_hu_6825d2b3cd30c674.png" type="image/png">
          <img
            src="/vcf9-darksite/04_hu_6825d2b3cd30c674.png"alt="USB Nic"width="1280"
            height="1280"/>
        </picture></a><figcaption><p>USB Nic (click to enlarge)</p></figcaption></figure>
<p>I just had to assign the USB adapter as an uplink to the temporarily created VDS on the ESX server at the right moment, and the NFS check was successful.
The actual deployment ran via the onboard card.</p>
<p>I will realize later that all the trouble with the USB NIC was actually unnecessary.
Like many other things, but I want to write down the entire journey here.</p>
<h2 id="workaround-4">Workaround 4:</h2>
<p>My first attempt was a greenfield deployment as a single host using the workaround described by William. The interactive GUI of the VCF installer does not allow greenfield deployment with a single host. This can be suppressed with two simple entries.</p>
<p>For VCF 9.0.1, the feature properties file must be modified on the VCF installer using the following command</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;feature.vcf.vgl-29121.single.host.domain=true&#34;</span> &gt;&gt; /home/vcf/feature.properties
</span></span></code></pre></div><p>For VCF 9.0, the feature properties file must be modified on the VCF installer using the following command</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;feature.vcf.internal.single.host.domain=true&#34;</span> &gt;&gt; /home/vcf/feature.properties
</span></span></code></pre></div><p>Finally, the services on the installer must be restarted.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s1">&#39;y&#39;</span> <span class="p">|</span> /opt/vmware/vcf/operationsmanager/scripts/cli/sddcmanager_restart_services.sh
</span></span></code></pre></div><p>The workaround does not allow the interactive installer to be used; it only allows a single cluster to be deployed via a JSON file.
The deployment looked fine at first, but then stalled after deploying the SDDC Manager.</p>
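<p>Because I ended up repeating these property edits over several failed attempts, here is a minimal sketch of how the append could be made idempotent, so a retry does not leave duplicate lines in the file. <code>add_property</code> is a hypothetical helper name of mine; the property names and paths are the ones from the commands above.</p>

```shell
#!/bin/sh
# Sketch: append a property line only if it is not already present.
# add_property is a hypothetical helper name.
add_property() {
    line="$1"
    file="$2"
    # -x matches the whole line, -F treats the line as a fixed string,
    # -q suppresses output, -e protects lines that start with a dash.
    if ! grep -qxF -e "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

<p>For VCF 9.0.1 this would be called as <code>add_property "feature.vcf.vgl-29121.single.host.domain=true" /home/vcf/feature.properties</code>, followed by the same service restart as before.</p>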
<p>I found this lovely error message in the VCF Installer log.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="s2">&#34;Failed to validate domain spec&#34;</span>, <span class="s2">&#34;nestedErrors&#34;</span>:
</span></span><span class="line"><span class="cl"><span class="o">{</span><span class="s2">&#34;errorCode&#34;</span>: <span class="s2">&#34;INVALID_NUMBER_OF_MINIMUM_HOSTS&#34;</span>, <span class="s2">&#34;arguments&#34;</span>:
</span></span><span class="line"><span class="cl"><span class="o">[</span><span class="s2">&#34;2&#34;</span>, <span class="s2">&#34;vLCM&#34;</span><span class="o">]</span>, <span class="s2">&#34;message&#34;</span>:<span class="s2">&#34;Minimum 2 hosts are required for vLCM cluster with external storage to be created.&#34;</span>
</span></span></code></pre></div><p>Since it didn&rsquo;t work after several attempts, I needed a new idea. What if I built a 2-node cluster? The problem was that I didn&rsquo;t have a second NUC available, and everything was supposed to run on one NUC. Then I had the idea of building a nested host (on one of my MS-01s), deploying VCF, and then turning off the nested host.
Nice idea, but it leads me to</p>
<h2 id="workaround-5">Workaround 5:</h2>
<p>Since VCF does not allow you to create a cluster from hosts made by different manufacturers (in my case, an ASUSTeK NUC and a Minisforum MS-01), I had to find a way around this.
Fortunately, it is possible to rewrite SMBIOS hardware strings.
I found the information I needed here on William&rsquo;s website. I must say that I have visited his site many times over the last few days.</p>
<p>First, you must set an ESX kernel setting so that the default hardware SMBIOS information can be overwritten.
To do this, edit the file /bootbank/boot.cfg and add the following to the line with kernelopt:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nv">kernelopt</span><span class="o">=</span><span class="nv">autoPartition</span><span class="o">=</span>FALSE <span class="nv">ignoreHwSMBIOSInfo</span><span class="o">=</span>TRUE
</span></span></code></pre></div><p>After that, the server must be rebooted.</p>
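<p>Editing boot.cfg by hand is easy to get wrong, so here is a minimal sketch of how the change could be scripted. The helper name add_kernelopt is hypothetical; it keeps a .bak copy and leaves the file alone if the option is already present. I am assuming a single kernelopt= line, as in a default boot.cfg.</p>

```shell
#!/bin/sh
# Hypothetical helper: append one option to the kernelopt= line of a
# boot.cfg file, unless that option is already set.
add_kernelopt() {
    cfg="$1"; opt="$2"
    grep -q "^kernelopt=.*${opt}" "$cfg" && return 0   # nothing to do
    cp "$cfg" "${cfg}.bak"                              # keep a backup
    # "&" re-inserts the matched line, then the new option is appended.
    sed "s/^kernelopt=.*/& ${opt}/" "${cfg}.bak" > "$cfg"
}

# Usage on the ESX host, followed by a reboot:
#   add_kernelopt /bootbank/boot.cfg ignoreHwSMBIOSInfo=TRUE
```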
<p>Next, I use William&rsquo;s PowerShell script to generate a vsish command that I can use to rewrite the SMBIOS information of my ESX host.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">Function Generate-CustomESXiSMBIOS <span class="o">{</span>
</span></span><span class="line"><span class="cl">    param<span class="o">(</span>
</span></span><span class="line"><span class="cl">        <span class="o">[</span>Parameter<span class="o">(</span><span class="nv">Mandatory</span><span class="o">=</span><span class="nv">$true</span><span class="o">)][</span>String<span class="o">]</span><span class="nv">$Uuid</span>,
</span></span><span class="line"><span class="cl">        <span class="o">[</span>Parameter<span class="o">(</span><span class="nv">Mandatory</span><span class="o">=</span><span class="nv">$true</span><span class="o">)][</span>String<span class="o">]</span><span class="nv">$Model</span>,
</span></span><span class="line"><span class="cl">        <span class="o">[</span>Parameter<span class="o">(</span><span class="nv">Mandatory</span><span class="o">=</span><span class="nv">$true</span><span class="o">)][</span>String<span class="o">]</span><span class="nv">$Vendor</span>,
</span></span><span class="line"><span class="cl">        <span class="o">[</span>Parameter<span class="o">(</span><span class="nv">Mandatory</span><span class="o">=</span><span class="nv">$true</span><span class="o">)][</span>String<span class="o">]</span><span class="nv">$Serial</span>,
</span></span><span class="line"><span class="cl">        <span class="o">[</span>Parameter<span class="o">(</span><span class="nv">Mandatory</span><span class="o">=</span><span class="nv">$true</span><span class="o">)][</span>String<span class="o">]</span><span class="nv">$SKU</span>,
</span></span><span class="line"><span class="cl">        <span class="o">[</span>Parameter<span class="o">(</span><span class="nv">Mandatory</span><span class="o">=</span><span class="nv">$true</span><span class="o">)][</span>String<span class="o">]</span><span class="nv">$Family</span>
</span></span><span class="line"><span class="cl">    <span class="o">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="nv">$guid</span> <span class="o">=</span> <span class="o">[</span>Guid<span class="o">]</span><span class="nv">$Uuid</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="nv">$guidBytes</span> <span class="o">=</span> <span class="nv">$guid</span>.ToByteArray<span class="o">()</span>
</span></span><span class="line"><span class="cl">    <span class="nv">$decimalPairs</span> <span class="o">=</span> foreach <span class="o">(</span><span class="nv">$byte</span> in <span class="nv">$guidBytes</span><span class="o">)</span> <span class="o">{</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;{0:D2}&#34;</span> -f <span class="nv">$byte</span>
</span></span><span class="line"><span class="cl">    <span class="o">}</span>
</span></span><span class="line"><span class="cl">    <span class="nv">$uuidPairs</span> <span class="o">=</span> <span class="nv">$decimalPairs</span> -join <span class="s1">&#39;, &#39;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    Write-Host -ForegroundColor Yellow <span class="s2">&#34;`nvsish -e set /hardware/bios/dmiInfo {\`&#34;</span><span class="si">${</span><span class="nv">Model</span><span class="si">}</span><span class="se">\`</span><span class="s2">&#34;, \`&#34;</span><span class="si">${</span><span class="nv">Vendor</span><span class="si">}</span><span class="se">\`</span><span class="s2">&#34;, \`&#34;</span><span class="si">${</span><span class="nv">Serial</span><span class="si">}</span><span class="se">\`</span><span class="s2">&#34;, [</span><span class="si">${</span><span class="nv">uuidPairs</span><span class="si">}</span><span class="s2">], \`&#34;</span>1.0.0<span class="se">\`</span><span class="s2">&#34;, 6, \`&#34;</span><span class="nv">SKU</span><span class="o">=</span><span class="si">${</span><span class="nv">SKU</span><span class="si">}</span><span class="se">\`</span><span class="s2">&#34;, \`&#34;</span><span class="si">${</span><span class="nv">Family</span><span class="si">}</span><span class="se">\`</span><span class="s2">&#34;}`n&#34;</span>
</span></span><span class="line"><span class="cl"><span class="o">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">Generate-CustomESXiSMBIOS -Uuid <span class="s2">&#34;43f32ef6-a3a8-44cb-9137-31cb4c6c520b&#34;</span> -Model <span class="s2">&#34;SDN-Warrior 3000&#34;</span> -Vendor <span class="s2">&#34;SDN-Warrior&#34;</span> -Serial <span class="s2">&#34;0816&#34;</span> -SKU <span class="s2">&#34;NUC&#34;</span> -Family <span class="s2">&#34;sdn-warrior.org&#34;</span>
</span></span></code></pre></div><p>The UUID and serial number should be unique for each host. The script prints a command string that I can simply run on the ESX server via SSH:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">vsish -e <span class="nb">set</span> /hardware/bios/dmiInfo <span class="o">{</span><span class="se">\&#34;</span>SDN-Warrior 3000<span class="se">\&#34;</span>, <span class="se">\&#34;</span>SDN-Warrior<span class="se">\&#34;</span>, <span class="se">\&#34;</span>0815<span class="se">\&#34;</span>, <span class="o">[</span>246, 46, 243, 67, 168, 163, 203, 68, 145, 55, 49, 203, 76, 108, 82, 10<span class="o">]</span>, <span class="se">\&#34;</span>1.0.0<span class="se">\&#34;</span>, 6, <span class="se">\&#34;</span><span class="nv">SKU</span><span class="o">=</span>NUC<span class="se">\&#34;</span>, <span class="se">\&#34;</span>sdn-warrior.org<span class="se">\&#34;</span><span class="o">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">/etc/init.d/hostd restart
</span></span></code></pre></div><p>The vsish setting does not survive a reboot on its own. To make it permanent, I inserted the two lines above into /etc/rc.local.d/local.sh.
After that, the custom SMBIOS information is set again after every reboot.</p>
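<p>For anyone without PowerShell at hand, the same command string can be built in plain shell. This is a sketch under assumptions, not William&rsquo;s script: the function names are my own, and the byte order mirrors what Guid.ToByteArray() returns in .NET (the first three UUID groups byte-swapped, the remaining bytes in order), which is what the PowerShell version relies on. The values 1.0.0 and 6 are copied unchanged from that script.</p>

```shell
#!/bin/sh
# Convert a UUID into the comma-separated decimal byte list that
# [Guid]::ToByteArray() produces: the first three groups are
# byte-swapped, the last two groups keep their order.
uuid_to_dmi_bytes() {
    hex=$(printf '%s' "$1" | tr -d '-')
    out=""
    for off in 6 4 2 0 10 8 14 12 16 18 20 22 24 26 28 30; do
        pair=$(printf '%s' "$hex" | cut -c"$((off + 1))-$((off + 2))")
        out="${out}${out:+, }$(printf '%d' "0x${pair}")"
    done
    printf '%s\n' "$out"
}

# Print (not run) the vsish command for the given SMBIOS values.
generate_custom_smbios() {
    uuid="$1"; model="$2"; vendor="$3"; serial="$4"; sku="$5"; family="$6"
    printf 'vsish -e set /hardware/bios/dmiInfo {\\"%s\\", \\"%s\\", \\"%s\\", [%s], \\"1.0.0\\", 6, \\"SKU=%s\\", \\"%s\\"}\n' \
        "$model" "$vendor" "$serial" "$(uuid_to_dmi_bytes "$uuid")" "$sku" "$family"
}

generate_custom_smbios "43f32ef6-a3a8-44cb-9137-31cb4c6c520b" \
    "SDN-Warrior 3000" "SDN-Warrior" "0816" "NUC" "sdn-warrior.org"
```

<p>The printed vsish line, together with the hostd restart, is what goes into /etc/rc.local.d/local.sh on the respective host.</p>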
<figure><a href="02.png"><picture><source srcset="/vcf9-darksite/02_hu_283c9e2e7339dc63.png" type="image/png">
          <img
            src="/vcf9-darksite/02_hu_283c9e2e7339dc63.png" alt="Make your own Vendor" width="1144"
            height="806"/>
        </picture></a><figcaption><p>Make your own Vendor (click to enlarge)</p></figcaption></figure>
<p>Congratulations, I just became my own hardware manufacturer. May I introduce the SDN Warrior 3000!</p>
<figure><a href="03.png"><picture><source srcset="/vcf9-darksite/03_hu_1333521006f3778b.png" type="image/png">
          <img
            src="/vcf9-darksite/03_hu_1333521006f3778b.png" alt="SDN-Warrior 3000" width="1700"
            height="982"/>
        </picture></a><figcaption><p>SDN-Warrior 3000 (click to enlarge)</p></figcaption></figure>
<p>But now the deployment should work, right? - NOPE!</p>
<p>I hadn&rsquo;t thought about the network. VCF requires two network adapters for good reason.
ESX is initially set up with a standard switch; after the SDDC Manager has been deployed and the cluster has been formed, the network is migrated to a VDS that initially uses only the second network card.
Of course, this is not possible without a second network card.
That was also the moment when I decided against a greenfield deployment and switched to Converge.</p>
<h2 id="redesign-and-converge">Redesign and Converge</h2>
<p>Actually, I&rsquo;ve long since reached the point where the project should be considered a failure, but I really wanted a VCF cluster on a NUC.
To make that work, however, I have to migrate to a VDS, otherwise Converge will fail.
First, I manually deployed a vCenter and created my vSphere cluster with my two SDN Warrior 3000 servers (a NUC and a nested host—a dream team).</p>
<p>As far as migration to the VDS is concerned, it&rsquo;s a chicken-and-egg problem. The VDS does offer the option of migrating all VMK adapters and the VM network at once, but as soon as I do that, there is a network interruption and shortly afterwards the vSphere protection mechanism kicks in and rolls back the configuration.
Fortunately, you can disable this, which brings me to</p>
<h2 id="workaround-6">Workaround 6</h2>
<p>The rollback can be switched off in vCenter under Advanced Settings by setting config.vpxd.network.rollback to false.</p>
<figure><a href="05.png"><picture><source srcset="/vcf9-darksite/05_hu_790054d5b2dbe7f2.png" type="image/png">
          <img
            src="/vcf9-darksite/05_hu_790054d5b2dbe7f2.png" alt="vCenter" width="1060"
            height="666"/>
        </picture></a><figcaption><p>vCenter Advanced Settings (click to enlarge)</p></figcaption></figure>
<p>Half of the work is done. I could now migrate the management VMK to the VDS, along with the VMK for NFS.
However, my vCenter is then suspended in a vacuum on the standard switch, and since vCenter itself is unreachable, it can no longer be connected to another port group through vCenter.
To do this directly on the ESX host instead, you have to change the distributed port group that vCenter will be migrated to from static binding to ephemeral.
The vCenter VM can then be moved to the port group on the VDS directly via the ESX host and is accessible again. Wonderful, the migration to the VDS with a single adapter was successful.
One quick note: the distributed port group has to be reconfigured before vCenter becomes unavailable.</p>
<figure><a href="06.png"><picture><source srcset="/vcf9-darksite/06_hu_efe4e6bf51b337a4.png" type="image/png">
          <img
            src="/vcf9-darksite/06_hu_efe4e6bf51b337a4.png" alt="VDS" width="1281"
            height="877"/>
        </picture></a><figcaption><p>happy little VDS (click to enlarge)</p></figcaption></figure>
<p>Since I now had a 2-node cluster with VDS, vMotion, and so on, the Converge could begin without any further hacks via the VCF installer, right? Well, not quite.
There was still a small error, but it was easy to fix. The installer checks whether the VDS has 2 uplinks, but it does not check whether they actually have a link.
So I quickly gave the VDS a second uplink and, lo and behold, the validation ran successfully.</p>
<figure><a href="07.png"><picture><source srcset="/vcf9-darksite/07_hu_3877ed2eabd084df.png" type="image/png">
          <img
            src="/vcf9-darksite/07_hu_3877ed2eabd084df.png" alt="precheck" width="1357"
            height="694"/>
        </picture></a><figcaption><p>Converge precheck (click to enlarge)</p></figcaption></figure>
<h2 id="converge">Converge</h2>
<p>The actual Converge process is relatively unspectacular. Broadcom and VMware have done a lot to make it as easy as possible (unless you have an adventurous setup like mine).
You no longer need Python scripts, as was the case with VCF 5.x for brownfield imports.
The process is straightforward: the VCF installer first deploys the SDDC Manager, then converts the existing vCenter into the new VCF instance, rolls out NSX and a cloud proxy, and finally joins the instance to the fleet and VCF Operations.
It can be that simple.</p>
<figure><a href="08.png"><picture><source srcset="/vcf9-darksite/08_hu_cf6205d3c09f2498.png" type="image/png">
          <img
            src="/vcf9-darksite/08_hu_cf6205d3c09f2498.png" alt="Converge" width="1717"
            height="1000"/>
        </picture></a><figcaption><p>Converge done! (click to enlarge)</p></figcaption></figure>
<h2 id="are-you-lying-to-us">Are you lying to us?</h2>
<p>One might ask this question, because the plan was to build a single-node cluster. On that point, I can only say: mission successfully failed. But there is a but.
The virtual ESX host is only a temporary solution, and in the future I will only power it on for lifecycle operations, as the NUC&rsquo;s performance is more than sufficient. So I simply removed the nested ESX host from the cluster and shut it down.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">At this point, I would like to reiterate my warning: do not simply remove hosts from a cluster in a production VCF environment. This can lead to many sleepless nights. This is for entertainment purposes only.</div>
    </aside>
<figure><a href="09.png"><picture><source srcset="/vcf9-darksite/09_hu_b90396f97fad862f.png" type="image/png">
          <img
            src="/vcf9-darksite/09_hu_b90396f97fad862f.png" alt="single node cluster" width="1494"
            height="988"/>
        </picture></a><figcaption><p>Single node cluster (click to enlarge)</p></figcaption></figure>
<p>So, technically speaking, I cheated a little bit, but as you can see in the screenshot, I deployed a working VCF instance. What&rsquo;s still missing are the NSX Edge and my supervisor cluster, but looking at the clock, that&rsquo;s a topic for another blog article.</p>
<h2 id="just-one-more-thing">Just one more thing</h2>
<p>To ensure that I still have logs and metrics from my cluster even when it is disconnected or VCF Operations is down, I have enabled data persistence on the cloud proxy.
This means that the data is stored until the environment can reach my fleet again.</p>
<h2 id="final-remarks-and-summary">Final remarks and summary</h2>
<p>It was an exciting and thrilling project that cost me a lot of sleep, as I worked on it tirelessly for the last three days.
However, I couldn&rsquo;t accept that it wasn&rsquo;t possible or that I couldn&rsquo;t do it. It also gave me back the motivation I had been lacking to write something again.
I must also say that I learned a lot. And as always, I have to express my gratitude to William, because at times I was on the verge of despair.</p>
<figure><a href="10.png"><picture><source srcset="/vcf9-darksite/10_hu_985c8572f588221d.png" type="image/png">
          <img
            src="/vcf9-darksite/10_hu_985c8572f588221d.png" alt="VCF Fleet" width="1711"
            height="1124"/>
        </picture></a><figcaption><p>VCF Fleet (click to enlarge)</p></figcaption></figure>
<p>I hope you enjoyed the article and maybe it will help some of you, or you can laugh at me and my journey through the valley of tears.</p>
]]></content>
		</item>
		
		<item>
			<title>VCF9 - Full-blown VCF 9 on 2 MS-A2 PCs</title>
			<link>https://sdn-warrior.org/posts/vcf9-ms-a2-special/</link>
			<pubDate>Fri, 24 Oct 2025 21:00:00 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf9-ms-a2-special/</guid>
			<description><![CDATA[A short article containing all the workarounds for rolling out a complete VCF9 on 2 MS-A2s.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>In the past, I built all my labs in a nested structure, which worked well, but now the demands on me and my lab are increasing because I want more.
I want to try more, even in areas I previously neglected. The idea of a redesign took shape in my head, and I knew there had to be more possibilities with my hardware.
Above all, I wanted to work with VCF Automation, because it is essential for multi-tenancy environments and replaces vCloud Director.
And let&rsquo;s be honest, automation is a resource hog.</p>
<p>The idea was quite simple: I would take my two MS-A2 servers and install a VCF management domain on them.
Both servers have 128 GB of RAM—that should be enough, right?
Unfortunately not really, but there is still memory tiering.
And that could be the end of the blog article—could be, if it weren&rsquo;t for AMD Ryzen.
Fortunately, there are workarounds, and thanks to <a href="https://williamlam.com/">William Lam</a>, I found them.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Most of the workarounds are not mine, and most have already been described in other blogs. I just want to show my new setup here and how I built it for myself.
The lab does not claim to run on minimal resources, but rather to fulfill my use cases and help others build something similar.</div>
    </aside>
<p>But let&rsquo;s take it slowly, one step at a time.</p>
<h2 id="hardware-setup">Hardware Setup</h2>
<p>My plan is to build a two-node cluster without vSAN and with memory tiering. NFS will be provided via my good old Unraid storage system.
I also started directly with VCF 9.0.1 as a greenfield deployment. I will continue to build my workload domains nested.
This allows me to set up a complete VCF9 setup with three physical hosts, including VCF Operations for Network, Operations for Logs, and Automation.</p>
<figure><a href="01.png"><picture><source srcset="/vcf9-special/01_hu_1038be1db341ef39.png" type="image/png">
          <img
            src="/vcf9-special/01_hu_1038be1db341ef39.png" alt="MS-A2" width="1512"
            height="1134"/>
        </picture></a><figcaption><p>MS-A2 (click to enlarge)</p></figcaption></figure>
<p>First, I install a 1 TB Samsung 990 Pro in the first slot of each MS-A2.
The first slot in the MS-A2 is set to PCIe 4.0 x4 in the BIOS by default.
The other slots are set to PCIe 3.0 x4 by default. You can change this in the BIOS, but you risk the NVMes overheating.
The active cooler does not seem to be sufficient to cool all NVMes. Here is an excerpt from the official FAQ:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">To prevent the SSD from overheating, the two SSDs on the right side of the back support a 
</span></span><span class="line"><span class="cl">maximum speed of PCIE4.0x4, but are set to PCIE3.0x4 by default. You can adjust it to PCIe Gen4 
</span></span><span class="line"><span class="cl">speed in OnBoard Setting in BIOS, but it may cause the SSD to overheat, which may result in blue 
</span></span><span class="line"><span class="cl">screen, freeze, or data loss.
</span></span></code></pre></div><p>In the middle slot, I have a 2 TB NVMe for booting and as local storage. The third slot is free in case I ever want to use vSAN.</p>
<h2 id="memory-tiering-and-first-steps-after-setup-esx">Memory Tiering and first steps after setting up ESX</h2>
<p>For VCF, SSH and NTP must be enabled after setting up the ESX server. It is also important that only the first 10 Gb/s adapter is used.
The second must be left unconfigured, as VCF deployment involves migrating from a standard switch to a distributed switch.
Also, don&rsquo;t forget to recreate the TLS certificate, otherwise the VCF installer will throw an error when adding the hosts.</p>
<ul>
<li>Cert regeneration</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">/sbin/generate-certificates
</span></span></code></pre></div><p>To enable memory tiering, the following must now be entered in the ESX CLI.</p>
<ul>
<li>This command turns on memory tiering</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli system settings kernel set -s MemoryTiering -v TRUE
</span></span></code></pre></div><ul>
<li>This command selects the NVMe</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli system tierdevice create -d /vmfs/devices/disks/&lt;Your NVME&gt;
</span></span></code></pre></div><ul>
<li>Enter the factor here (0-400%)</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli system settings advanced set -o /Mem/TierNvmePct -i 200
</span></span></code></pre></div><p>I use a factor of 200% because I need enough reserve to be able to move everything to one host during lifecycle operations.
But what&rsquo;s the problem? The commands are well known, and I already described them in an article from 2024.</p>
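<p>Put together, the three commands and the resulting capacity can be sketched like this. The script is a dry run by default, so it only prints the esxcli calls; the run wrapper, the function names and the placeholder device path are my own additions. The capacity calculation assumes that /Mem/TierNvmePct is a percentage of the installed DRAM.</p>

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 on a real ESX host to run them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# NVMe tier size in GB for a DRAM size (GB) and a TierNvmePct value.
tier_size_gb() {
    echo $(( $1 * $2 / 100 ))
}

# The three steps from the article, in order.
enable_memory_tiering() {
    disk="$1"; pct="$2"
    run esxcli system settings kernel set -s MemoryTiering -v TRUE
    run esxcli system tierdevice create -d "$disk"
    run esxcli system settings advanced set -o /Mem/TierNvmePct -i "$pct"
    # The kernel setting usually only becomes active after a reboot.
}

# Placeholder device path: replace YOUR_NVME with the real device name.
enable_memory_tiering "/vmfs/devices/disks/YOUR_NVME" 200

dram=128
tier=$(tier_size_gb "$dram" 200)
echo "NVMe tier: ${tier} GB, total: $(( dram + tier )) GB"
```

<p>With 128 GB of DRAM and a factor of 200, this works out to a 256 GB NVMe tier and 384 GB of memory per host.</p>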
<h3 id="ryzen-workaroud-no1">Ryzen Workaround No. 1</h3>
<p>The problem lies with the AMD Ryzen processor and does not affect AMD EPYC or Intel Core, Ultra, or Xeon CPUs.
To be fair, Ryzen is not on the compatibility list, and VMware homelabs have traditionally been built on NUCs, with many VMware employees themselves building labs with Intel NUCs.
Whether this is the reason why Intel NUCs work so well is something I can only speculate about.
But what exactly are the problems?</p>
<p>Well, the error was a bit strange.
The VCF installer first installed vCenter and everything ran smoothly so far.
The first problems arose with the SDDC Manager, which was deployed after vCenter.
It simply refused to boot up. The VM was successfully turned on, and you could see the Photon OS splash screen, but after that, everything remained dark.
My first suspicion was my NFS, as I had had a lot of problems with the reliability of my NFS shares after an Unraid update before my vacation. However, that turned out to be a dead end.</p>
<p>I then tried to install the VCF Installer (aka SDDC Manager) manually on local storage and got the same result.
The same thing happened with NSX Manager. It boots a little further than SDDC, but also stops quite early on with no error message.</p>
<figure><a href="02.png"><picture><source srcset="/vcf9-special/02_hu_c9c47fe03a42fbb.png" type="image/png">
          <img
            src="/vcf9-special/02_hu_c9c47fe03a42fbb.png" alt="NSX Manager" width="3840"
            height="2160"/>
        </picture></a><figcaption><p>NSX Manager stops booting (click to enlarge)</p></figcaption></figure>
<p>After a long search on Google, I found the solution on <a href="https://williamlam.com/2025/06/nvme-tiering-with-amd-ryzen-cpu-workaround-for-vcf-9-0.html">William&rsquo;s blog</a>.</p>
<p>For memory tiering to work properly with AMD Ryzen, a VM advanced setting must be configured.
Of course, this is not scalable if it has to be set for each individual VM, and it also does not work with the automatic deployment process of VCF.
Fortunately, it can also be set globally. To do this, you must log in to each ESX host via SSH and execute the following command:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">echo &#39;monitor_control.disable_apichv =&#34;TRUE&#34;&#39; &gt;&gt; /etc/vmware/config
</span></span></code></pre></div><p>The workaround takes effect immediately, and the ESX server does not need to be rebooted. VMs that were already running must be shut down and powered on again before the workaround applies to them.</p>
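<p>Because echo &gt;&gt; adds another copy of the line every time it is run, a small helper that replaces an existing entry can be handy. This is a minimal sketch; set_config_value is a hypothetical name, and the file is overwritten in place with cat instead of being moved, on the assumption that this leaves the permissions of the original file untouched.</p>

```shell
#!/bin/sh
# Hypothetical helper: set key = "value" in a VMware-style config file,
# replacing an existing entry for the same key instead of duplicating it.
set_config_value() {
    file="$1"; key="$2"; value="$3"
    tmp=$(mktemp)
    # Keep every line except an earlier assignment of this key.
    # (Dots in the key act as regex wildcards, which is good enough here.)
    grep -v "^ *${key} *=" "$file" > "$tmp" 2>/dev/null || true
    printf '%s = "%s"\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"    # overwrite in place instead of moving the file
    rm -f "$tmp"
}

# Usage on each ESX host:
#   set_config_value /etc/vmware/config monitor_control.disable_apichv TRUE
```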
<h3 id="ryzen-workaroud-no2-nsx-edge-cpu-check">Ryzen Workaround No. 2 (NSX Edge CPU Check)</h3>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">The workaround with the script no longer works with VCF 9.0.2 because the position of the CPU check has changed. There is a new, more effective workaround, which I describe below.</div>
    </aside>
<p>The next workaround concerns the problem with the Data Plane Development Kit (DPDK) and the lack of support for Ryzen CPUs.
I already described how to solve this problem manually in my <a href="https://sdn-warrior.org/posts/ms-a2/#wheres-the-poop-robin">MS-A2</a> test.</p>
<p>However, William has a better solution here too, and I am using his PowerShell script, which he has kindly made publicly available.
Nothing has changed in terms of the actual fix; you simply comment out the CPU check and the Edge VM will then work as it should.
I also didn&rsquo;t notice any performance issues. I get 10 Gb/s north-south over my edges without any problems.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">Connect-VIServer -Server vc01.vcf.lab -User administrator@vsphere.local -Password VMware1!VMware1!
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">$edges = @(&#34;edge01a&#34;,&#34;edge01b&#34;)
</span></span><span class="line"><span class="cl">$edgeUser = &#34;root&#34;
</span></span><span class="line"><span class="cl">$edgePass = &#34;VMware1!VMware1!&#34;
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">### DO NOT EDIT BEYOND HERE
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">$edgeScript = &#34;sed -i `&#39;/if `&#34;AMD`&#34; in vendor_info and `&#34;AMD EPYC`&#34; not in model_name:/s/^/        #/;/self.error_exit(`&#34;Unsupported CPU: %s`&#34; % model_name)/s/^/        #/`&#39; /opt/vmware/nsx-edge/bin/config.py&#34;
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">foreach ($edge in $edges) {
</span></span><span class="line"><span class="cl">    Invoke-VMScript -VM (Get-VM $edge) -ScriptText $edgeScript  -GuestUser $edgeUser -GuestPassword $edgePass
</span></span><span class="line"><span class="cl">}
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">Disconnect-VIServer * -Confirm:$false
</span></span></code></pre></div><p>The script is part of <a href="https://github.com/lamw/vmware-scripts/tree/master">William&rsquo;s script collection</a>, which I highly recommend to everyone.</p>
<h3 id="new-ryzen-cpu-nsx-workaround-nsx-edge-cpu-check">New Ryzen CPU NSX Workaround (NSX Edge CPU Check)</h3>
<p>The improved workaround is both simple and sustainable. Instead of deactivating the CPU check in the Edge every time during deployment, you simply trick the system into thinking it has a suitable CPU. May I introduce the EPYC Ryzen processor.
To do this, log in to each ESX server once as root and enter the following command.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">echo &#39;cpuid.brandstring = &#34;AMD EPYC Ryzen 9 9955HX&#34;&#39; &gt;&gt; /etc/vmware/config
</span></span></code></pre></div><p>A reboot of the ESX server is not necessary, but existing Edge VMs must go through a full power cycle once. Edge VMs deployed in the future will then see the EPYC Ryzen CPU directly and pass the CPU check in the Edge VM script. Simple and update-safe.</p>
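<p>Why the new brand string is enough becomes clear from the condition that the older sed workaround comments out. Restated in shell, purely as an illustration: the real check lives in the Edge&rsquo;s Python script, while edge_cpu_check, the vendor string and the shortened model names below are my own assumptions.</p>

```shell
#!/bin/sh
# Shell restatement of the condition targeted by the sed workaround:
#   if "AMD" in vendor_info and "AMD EPYC" not in model_name -> error
edge_cpu_check() {
    vendor="$1"; model="$2"
    case "$vendor" in
        *AMD*)
            case "$model" in
                *"AMD EPYC"*) ;;    # AMD, but the name contains "AMD EPYC"
                *) echo "Unsupported CPU: $model"; return 1 ;;
            esac ;;
    esac
    echo "CPU accepted: $model"
}

# The original brand string is rejected, the patched one passes.
edge_cpu_check "AuthenticAMD" "AMD Ryzen 9 9955HX" || true
edge_cpu_check "AuthenticAMD" "AMD EPYC Ryzen 9 9955HX"
```

<p>The check only looks for the substring, so any brand string that contains AMD EPYC gets through, which is exactly what the cpuid.brandstring entry exploits.</p>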
<h2 id="vcf-setup">VCF Setup</h2>
<p>Now we come to the exciting part. After all the workarounds have been implemented and I was able to successfully complete the deployment, the question remains as to what exactly my design looks like now. Because I think that&rsquo;s what most readers are interested in.</p>
<p>I currently have three physical hosts in use for my full-blown VCF9 setup.
Two MS-A2s form my management domain and one MS-01 is a standalone host (without memory tiering, but with P/E cores enabled – the workaround can be found <a href="https://sdn-warrior.org/posts/nuc/#using-pe-cores">here</a> and has nothing directly to do with the MS-A2).</p>
<p>Why do I need the standalone host? Quite simply, I have two nested ESX servers running on it, which form my workload domain. Since the workload domain requires significantly fewer resources (in a lab), I can easily map it with an MS-01.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">The nested workload domain is not actually necessary to test all products in the VCF9 stack, but since I also used the lab for a customer demo,
I wanted to set it up as close to reality as possible.
Because the workload domain is nested, I still have enough capacity in my lab to deploy a nested VCF instance with vSAN or another workload domain, for example.
This becomes interesting when you look at multi-tenancy with automation in a multi-instance single fleet design—something I would like to take a closer look at when I have the opportunity.</div>
    </aside>
<p>Since a picture says more than a thousand words, I have created a design drawing here. The whole thing is based on Broadcom&rsquo;s official design blueprints.</p>
<figure><a href="03.png"><picture><source srcset="/vcf9-special/03_hu_c5eca1d081a6ede.png" type="image/png">
           <img
             src="/vcf9-special/03_hu_c5eca1d081a6ede.png"alt="Design"width="4118"
             height="4809"/>
         </picture></a><figcaption><p>Lab Design (click to enlarge)</p></figcaption></figure>
<p>And here are the deployed components in the Management Domain and the required resources:</p>
<table>
  <thead>
      <tr>
          <th>Name</th>
          <th>virt. CPUs</th>
          <th>Memory Size</th>
          <th>Provisioned Space</th>
          <th>Used Space</th>
          <th>Usage</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>vcfa-mgmt-7s24b</td>
          <td>16</td>
          <td>96 GB</td>
          <td>529 GB</td>
          <td>134 GB</td>
          <td>Automation</td>
      </tr>
      <tr>
          <td>vcf09-w01-nsxa</td>
          <td>6</td>
          <td>24 GB</td>
          <td>300 GB</td>
          <td>37 GB</td>
          <td>NSX WLD01</td>
      </tr>
      <tr>
          <td>vcf09-nsxa</td>
          <td>6</td>
          <td>24 GB</td>
          <td>300 GB</td>
          <td>46 GB</td>
          <td>NSX MGMT</td>
      </tr>
      <tr>
          <td>vcf09-w01-vcsa</td>
          <td>4</td>
          <td>21 GB</td>
          <td>941 GB</td>
          <td>50 GB</td>
          <td>VCSA WLD</td>
      </tr>
      <tr>
          <td>vcf09-ni</td>
          <td>8</td>
          <td>32 GB</td>
          <td>1000 GB</td>
          <td>63 GB</td>
          <td>Ops for Network</td>
      </tr>
      <tr>
          <td>vcf09-vcsa</td>
          <td>4</td>
          <td>21 GB</td>
          <td>742 GB</td>
          <td>50 GB</td>
          <td>VCSA MGMT</td>
      </tr>
      <tr>
          <td>vcf09-m01-edge02</td>
          <td>4</td>
          <td>8 GB</td>
          <td>197 GB</td>
          <td>24 GB</td>
          <td>Edge MGMT</td>
      </tr>
      <tr>
          <td>vcf09-m01-edge01</td>
          <td>4</td>
          <td>8 GB</td>
          <td>197 GB</td>
          <td>22 GB</td>
          <td>Edge MGMT</td>
      </tr>
      <tr>
          <td>fleet</td>
          <td>4</td>
          <td>12 GB</td>
          <td>193 GB</td>
          <td>113 GB</td>
          <td>Fleet Manager</td>
      </tr>
      <tr>
          <td>vcf09-ops</td>
          <td>4</td>
          <td>16 GB</td>
          <td>274 GB</td>
          <td>23 GB</td>
          <td>VCF Operations</td>
      </tr>
      <tr>
          <td>vcf09-li-master</td>
          <td>4</td>
          <td>8 GB</td>
          <td>530 GB</td>
          <td>182 GB</td>
          <td>Ops for Logs</td>
      </tr>
      <tr>
          <td>vcf09-sddc</td>
          <td>4</td>
          <td>16 GB</td>
          <td>914 GB</td>
          <td>74 GB</td>
          <td>SDDC</td>
      </tr>
      <tr>
          <td>vcf09-ni-col.lab</td>
          <td>4</td>
          <td>12 GB</td>
          <td>250 GB</td>
          <td>38 GB</td>
          <td>Ops for Network collector</td>
      </tr>
      <tr>
          <td>vcfopscol</td>
          <td>4</td>
          <td>16 GB</td>
          <td>264 GB</td>
          <td>19 GB</td>
          <td>Cloud Proxy</td>
      </tr>
      <tr>
          <td><strong>Total</strong></td>
          <td><strong>76</strong></td>
          <td><strong>314 GB</strong></td>
          <td><strong>6631 GB</strong></td>
          <td><strong>875 GB</strong></td>
          <td></td>
      </tr>
  </tbody>
</table>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">The Automation Appliance is initially deployed with 24 vCPUs. After the initial start, I reduced the vCPUs to 16 and did not notice any negative impact. It may be possible to reduce this number even further.</div>
    </aside>
<h3 id="cpu-resources">CPU resources</h3>
<p>Wow, that&rsquo;s a lot of appliances and a lot of power required. First, the good news: when I turn off the automation appliance, the entire setup runs smoothly and quickly on an MS-A2 without any problems.</p>
<p>With automation turned on, the vCenter times out and many components no longer work properly.
Even without test VMs in my management domain, I have an overbooking factor of 2.4 to 1, and the actual bottleneck is neither RAM nor storage but the CPU.</p>
<p>The recommendation is to overbook a management domain by no more than 2 to 1.
If I move the management domain to a single host, the overbooking factor is 4.75 to 1 with automation and 3.75 to 1 without automation.
So it&rsquo;s not surprising that a single host only works properly without automation.</p>
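<p>The overbooking factors above can be reproduced with a few lines of Python. This is only a sketch: the value of 16 physical cores per MS-A2 is an assumption inferred from the ratios in the text, and the helper name is made up for illustration.</p>

```python
# Minimal sketch of the vCPU-to-core overbooking calculation.
# ASSUMPTION: 16 physical cores per MS-A2 host (inferred from the ratios
# quoted in the text, not stated there explicitly).
CORES_PER_HOST = 16

TOTAL_VCPUS = 76        # sum of the "virt. CPUs" column in the table
AUTOMATION_VCPUS = 16   # vcfa-mgmt appliance after reducing it from 24


def overbooking_factor(total_vcpus, hosts, cores_per_host=CORES_PER_HOST):
    """Return provisioned vCPUs divided by the available physical cores."""
    return total_vcpus / (hosts * cores_per_host)


print(overbooking_factor(TOTAL_VCPUS, hosts=2))                     # both hosts
print(overbooking_factor(TOTAL_VCPUS, hosts=1))                     # one host
print(overbooking_factor(TOTAL_VCPUS - AUTOMATION_VCPUS, hosts=1))  # one host, no automation
```

<p>With these inputs, the two-host case comes out just under 2.4 to 1, and the single-host cases match the 4.75 and 3.75 quoted above.</p>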
<h3 id="storage-resources">Storage resources</h3>
<p>As usual, I use an NFSv3 share from my self-built Unraid NAS, which is connected with 2x10 Gb/s, for storage.
All VMs are thin provisioned, and I have provided 2 shares with 4 TB NVMe storage each.
My VMs from the management domain go into the first share, and my VMs from the workload domain go into the second share.</p>
<p>Since my Unraid has 2x10 Gb/s networking and I have 2 VLANs for NFS, both the management domain and the workload domain can use the full 10 Gb/s, as each has its own physical path to the storage and they therefore cannot slow each other down.</p>
<p>The actual storage usage of less than one terabyte is not that much. I also have a third 4 TB share that I could connect if space becomes scarce in the future.</p>
<h3 id="ram-resources">RAM resources</h3>
<p>My MS-A2s are each equipped with a 128 GB DDR5 kit. I also selected a factor of 200% for memory tiering.
This means that each ESX server has a theoretical 384 GB of RAM. However, I am somewhat surprised that no Tier 1 memory (memory tiering) is currently being used in my management domain.</p>
<figure><a href="04.png"><picture><source srcset="/vcf9-special/04_hu_8f581094a7176808.png" type="image/png">
           <img
             src="/vcf9-special/04_hu_8f581094a7176808.png"alt="RAM Usage"width="1622"
             height="658"/>
         </picture></a><figcaption><p>RAM usage in Management Domain (click to enlarge)</p></figcaption></figure>
<p>Let&rsquo;s take a closer look. If you look in vCenter, you can see that I currently have an approximate Tier 0 RAM allocation of 188 GB. A large part of this comes from automation.
But even with automation, I still have physical Tier 0 RAM left over, which is why you can see in the screenshot that Tier 1 RAM is listed as -1 MB everywhere.</p>
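<p>As a rough sanity check, the tiering arithmetic can be written down as follows. This is a sketch under assumptions: it treats the 200% factor as a Tier 1 (NVMe) size of twice the DRAM, assumes the 188 GB allocation is spread across both MS-A2 hosts, and ignores hypervisor overhead. The function name is hypothetical.</p>

```python
# Minimal sketch of the memory tiering arithmetic.
# ASSUMPTIONS: the 200 % factor means Tier 1 (NVMe) is sized at twice the
# Tier 0 DRAM; the allocation is spread over both hosts; overhead is ignored.
DRAM_PER_HOST_GB = 128
TIERING_FACTOR = 2.0
HOSTS = 2
TIER0_ALLOCATED_GB = 188  # approximate value read from vCenter


def tiered_capacity_gb(dram_gb, factor):
    """Return (tier0, tier1, total) capacity in GB for one host."""
    tier0 = dram_gb
    tier1 = dram_gb * factor
    return tier0, tier1, tier0 + tier1


tier0, tier1, total = tiered_capacity_gb(DRAM_PER_HOST_GB, TIERING_FACTOR)
print(f"Per host: {tier0} GB Tier 0 + {tier1:.0f} GB Tier 1 = {total:.0f} GB")

cluster_tier0 = HOSTS * tier0
print(f"Tier 0 headroom in the cluster: {cluster_tier0 - TIER0_ALLOCATED_GB} GB")
```

<p>As long as the headroom stays positive, there is no pressure to push pages to Tier 1, which fits the empty Tier 1 column in the screenshot.</p>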
<p>Without the operation appliance, I have a RAM utilization of 130 GB, so I don&rsquo;t really need the factor of 200%. I chose it anyway because, among other things, SSP is to be deployed in the future, and SSP requires 2x16 vCPUs and 64 GB RAM.
That makes it another little resource hog, and the higher factor gives me more leeway to run everything on one host.
You also have to think about the future lifecycle.
In addition, another nested workload domain is to be added, which will bring another vCenter with it.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">It is not yet certain whether SSP will ultimately enable everything to run on an MS-A2. I will continue to explore whether I can further limit the appliances. There is definitely potential for savings here. Nevertheless, the main goal must be to ensure that all components affected by the VCF lifecycle can run on one host.
This primarily affects the Fleet, NSX, and vCenter components. If the workload domain and its management components have to be switched off during the lifecycle, this is not a problem, as these are updated separately, just like automation or SSP.</div>
    </aside>
<h2 id="vcf-901-nsx-edge-workaround">VCF 9.0.1 NSX Edge Workaround</h2>
<p>My colleague <a href="https://vi-universe.github.io/">Christian</a> and I encountered a spontaneous error. If, like him or me, you don&rsquo;t have the lab running 24/7, it can happen that the uplink port groups of the Edge VM are deleted.
This should only happen if the environment is powered on and the port groups go 24 hours without a connected Edge VM.</p>
<p>What is the result? Well, a huge mountain of error messages.
The Edge VMs no longer have an uplink or a TEP network, meaning that all north-south communication is dead.</p>
<figure><a href="05.png"><picture><source srcset="/vcf9-special/05_hu_709dc72aba390e89.png" type="image/png">
           <img
             src="/vcf9-special/05_hu_709dc72aba390e89.png"alt="NSX Edge"width="1719"
             height="811"/>
         </picture></a><figcaption><p>NSX Edge trouble (click to enlarge)</p></figcaption></figure>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">This is a known issue with VCF 9.0.1:
When edge is created by Setup Network Connectivity UI, the System created dvpg consumed by edge gets deleted when edge is powered on after 24 hrs. The port group assigned to the NSX Edge uplink has disappeared making it impossible to use the network through NSX Edge.
You can find the release notes <a href="https://techdocs.broadcom.com/us/en/vmware-cis/vcf/vcf-9-0-and-later/9-0/release-notes/vmware-cloud-foundation-9-0-1-release-notes/nsx-9-0-1-0000.html#GUID-03f28b57-6cef-441a-acf7-1a0204a3bff2-en_id-e934abd0-6d05-4dc4-a878-64630bc97a68">here</a></div>
    </aside>
<p>But first, let&rsquo;s take a step back. In VCF 9, a separate trunk port group is created for each Fastpath interface of the Edge VM.
With two edges, each with two uplinks, you therefore have four port groups.</p>
<figure><a href="06.png"><picture><source srcset="/vcf9-special/06_hu_cd06cfb446bfce90.png" type="image/png">
          <img
            src="/vcf9-special/06_hu_cd06cfb446bfce90.png"alt="NSX Edge Uplinks"width="1078"
            height="450"/>
        </picture></a><figcaption><p>NSX Edge Uplinks (click to enlarge)</p></figcaption></figure>
<p>If you take a closer look at the port groups, you will see that they are completely normal trunk port groups.
To fix this error, you simply need to create two new trunk port groups (note the teaming policy: trunk 1 uses uplink 1 as active and uplink 2 as standby, and vice versa for trunk 2).
These port groups must then be assigned as Adapter 1 and Adapter 2 of the Edge VM in vCenter. Apparently, the error only affects Edge VMs that were created via the vCenter.</p>
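<p>The mirrored teaming policy is easy to get wrong, so here is a small sketch that writes it down as plain data and checks it. The port group and uplink names are hypothetical; only the active/standby order comes from the description above, and nothing here talks to vCenter.</p>

```python
# Minimal sketch: the two replacement trunk port groups as plain data.
# ASSUMPTION: all names are placeholders; only the mirrored active/standby
# order is taken from the text.
trunk_port_groups = {
    "edge-trunk-01": {"vlan": "trunk", "active": ["uplink1"], "standby": ["uplink2"]},
    "edge-trunk-02": {"vlan": "trunk", "active": ["uplink2"], "standby": ["uplink1"]},
}


def is_mirrored(first, second):
    """True if each port group uses the standby uplink of the other as active."""
    return first["active"] == second["standby"] and first["standby"] == second["active"]


print(is_mirrored(trunk_port_groups["edge-trunk-01"], trunk_port_groups["edge-trunk-02"]))
```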

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Always check the uplink profile created in NSX. In VCF 9.0, the uplink profile was created without a TEP VLAN ID, which was stored directly in the edge. With the newly created port groups, the TEP network no longer works. In VCF 9.0.1, this no longer seems to be the case; my uplink profiles all have a TEP VLAN ID. Since I no longer have a VCF 9.0 setup, I cannot reproduce the problem.</div>
    </aside>
<h2 id="conclusion">Conclusion</h2>
<p>With the new setup, I am prepared for all eventualities.
I have enough space in my physical lab to create additional nested labs, and I have a complete solution lab where I can test all aspects of VCF 9. Best of all, I only need three physical hosts, which means I can get by with about 500 watts of power consumption.
My always-on equipment accounts for about 200 watts of that, which puts the VCF 9 setup itself at about 300 watts. Not bad at all, to be honest.
That leaves me with 5 more MS-01s that I can use to test things out alongside VCF 9.</p>
]]></content>
		</item>
		
		<item>
			<title>Tales from the Lab – Lab setup</title>
			<link>https://sdn-warrior.org/posts/tales-from-the-lab-setup/</link>
			<pubDate>Fri, 12 Sep 2025 17:00:00 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/tales-from-the-lab-setup/</guid>
			<description><![CDATA[A brief blog about setting up my nested lab.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>I have written a lot about the actual lab setup in the past, but it was always about the hardware and not about how I manage my labs or, more recently, how I have built up my VCF fleet.
I want to change that here now, as I have received an increasing number of requests for more information.</p>
<p>So let&rsquo;s jump right in. First things first, let&rsquo;s take a look at the physical lab infrastructure. That&rsquo;s my basis for everything else.</p>
<h2 id="physical-lab-infrastructure">Physical lab infrastructure</h2>
<p>I currently own eight Minisforum servers and one NUC Ultra 7.
The Minisforum servers are for workloads and are divided into two clusters, as some of the hardware has AMD CPUs and the other servers have Intel CPUs.
My NUC is its own cluster and runs my always-on workload, as the Minisforum servers draw too much power and generate too much heat for 24/7 operation.</p>
<figure><a href="01.png"><picture><source srcset="/lab-setup/01_hu_1e12ec72089f5c66.png" type="image/png">
          <img
            src="/lab-setup/01_hu_1e12ec72089f5c66.png"alt="vCenter"width="2946"
            height="1820"/>
        </picture></a><figcaption><p>Lab vCenter (click to enlarge)</p></figcaption></figure>
<p>As you can see in the image, two Minisforum servers are currently not part of my cluster for licensing reasons.
This will change in the coming weeks, but Proxmox is currently running on them for testing purposes.</p>
<h3 id="management-nuc">Management NUC</h3>
<p>This setup is the foundation for my entire lab.
Most of my fleet runs on my small NUC, including a VKS Supervisor cluster as a single deployment. Important services provided by the NUC are:</p>
<ul>
<li>Active Directory (Microsoft)</li>
<li>Veeam Backup Server</li>
<li>VCF Operations</li>
<li>VCF Operations for Logs</li>
<li>VCF Cloud Proxy</li>
<li>VCF Fleetmanager</li>
<li>Fortigate Logserver (for my Fortigate 40F)</li>
<li>VKS Supervisor with Foundation Loadbalancer</li>
<li>VCF Offline Repository</li>
<li>vCenter for the Lab</li>
<li>Netbox</li>
<li>mDNS Repeater (It&rsquo;s not relevant for the lab, but it is for my number one service owner, because without it, you can&rsquo;t turn on the TV with Siri and watch Netflix.)</li>
<li>Windows 11 Client (There are things I can&rsquo;t do with my Mac.)</li>
</ul>
<p>Of course, the NUC is pretty busy, and I&rsquo;m considering getting a second one, because the boxes don&rsquo;t cost that much (used) and don&rsquo;t take up much space.
Bonus points for the fact that every NUC is certified for 24/7 operation.</p>
<p>Alternatively, I could use memory tiering, but that would require reinstalling the little guy, and somehow I&rsquo;m too lazy to do that.
My self-built storage (which I&rsquo;ve described <a href="https://sdn-warrior.org/posts/unraid-storage/">here</a> before) runs another domain controller, my root CA and my Ansible host.<br>
I obtain NTP globally from my ToR switch, as Mikrotik, the old Swiss Army knife, can do this wonderfully.</p>
<h2 id="storage">Storage</h2>
<p>Let&rsquo;s move on to the exciting topic of storage. Here, I have a combination of local and shared storage.
Each server has a 2 TB local NVMe installed on which the hypervisor itself is located, and the rest of the NVMe is provided as VMFS storage.
On my Unraid server, I provide 4 TB as iSCSI storage and 2x 4 TB as NFS storage.
The storage on Unraid is implemented on NVMes, but without RAID. If a disk fails, I just lose a share.</p>
<figure><a href="02.png"><picture><source srcset="/lab-setup/02_hu_808cbbf288df33e5.png" type="image/png">
          <img
            src="/lab-setup/02_hu_808cbbf288df33e5.png"alt="Storage"width="2946"
            height="1820"/>
        </picture></a><figcaption><p>Lab Storage (click to enlarge)</p></figcaption></figure>
<p>Now you might ask yourself why I did it this way. Well, iSCSI has the advantage that I can implement a multipath fabric, and it is basically my fastest storage, since I have 2x 10 Gb/s for iSCSI and can utilize it to its full capacity.
All my servers are equipped with 2x10 Gb/s networks (except for my small management NUC). My iSCSI storage runs VMs that require maximum storage performance.
I use NFS for VCF nested labs where I don&rsquo;t want to use vSAN, and most of the time I don&rsquo;t want to use vSAN in my lab for resource reasons. Fortunately, this is now supported without any problems with VCF9.
But why did I build two NFS shares and not just one large share?</p>
<p>Well, that&rsquo;s because my storage only supports NFS3 (NFS 4 keeps causing problems with Unraid, and VCF 9 requires NFS3 anyway if you want to use it as primary storage), which means multipathing is not possible.
For maximum performance, the first NFS share uses the first physical adapter of my storage system and the second NFS share uses the second physical adapter.
This means that the two shares do not interfere with each other. In addition, the PCIe bus to which the NVMes are connected does not exceed 10 Gb/s anyway.</p>
<p>If I want to use VCF with vSAN, I use the local storage of my hypervisors, as I have had problems with vSAN and nested ESXi on my shared storage from time to time.</p>
<h2 id="drs-and-network">DRS and Network</h2>
<p>Let&rsquo;s move on to my DRS and network settings.
This is relatively simple. My DRS is set to fully automated but in conservative mode, as I don&rsquo;t want any “unnecessary” DRS actions from nested ESX servers.
As a rule, I think about where to place my VMs beforehand and then they should stay there.</p>
<p>Since I don&rsquo;t always test nested VCF labs in my lab, DRS is generally still enabled, as some scenarios, VKS for example, check whether DRS is enabled.</p>
<p>My network settings are currently still under construction. At the moment, I have a distributed switch for all three clusters, but last year showed that this is not the best setup when hosts are frequently down.
My new plan is to have a separate vDS for each cluster. This is more flexible and I don&rsquo;t constantly have the problem that my vDS is no longer in sync when I need a new network on my management NUC.</p>
<p>My port groups are usually active/active on both uplinks.
There are three exceptions. Two of them are my trunk port groups with MAC learning:
one trunk port group only uses uplink 1 of the vDS and the other only uses uplink 2.
This is important for NSX labs if you want to test traffic steering of the NSX teaming policy.
The third exception is my port group on which promiscuous mode is enabled. Here, I use active/standby, as otherwise I would have duplicated packets and performance losses.</p>
<figure><a href="03.png"><picture><source srcset="/lab-setup/03_hu_a602a93e6500ef42.png" type="image/png">
          <img
            src="/lab-setup/03_hu_a602a93e6500ef42.png"alt="Network"width="2946"
            height="1820"/>
        </picture></a><figcaption><p>Lab Network (click to enlarge)</p></figcaption></figure>
<p>If you want to know more about this topic, you can read my article on <a href="https://sdn-warrior.org/posts/mac-learning/">Mac Learning</a>.
Here, I address the problem and also show the differences in performance.
For VCF 9 installation, promiscuous mode is required, otherwise the deployment will fail when the VCF installer attempts to migrate the ESX servers to the distributed switch.</p>
<h2 id="vcf-fleet">VCF Fleet</h2>
<p>My fleet setup is a bit of a hodgepodge. Since I need VCF Operations for licensing, this is already running on my management NUC, as mentioned, but my VCF Automation is on an MS-A2 as a VM and is only turned on when I really need it, as it is a real resource hog. In addition, a VKS cluster has recently been running for testing purposes.
I want to use automation to automatically deploy VMs that are running in the vSphere Namespace - fancy new things.</p>
<figure><a href="04.png"><picture><source srcset="/lab-setup/04_hu_b74ef2ac2ea7eee5.png" type="image/png">
          <img
            src="/lab-setup/04_hu_b74ef2ac2ea7eee5.png"alt="VCF Operations"width="2946"
            height="1820"/>
        </picture></a><figcaption><p>VCF Operations (click to enlarge)</p></figcaption></figure>
<p>I have several VCF instances onboarded in my central VCF Operations. Since I only have a certain number of licenses, I delete labs in VCF Operations that are not currently needed.
This is a bit annoying, but with the current licensing model, there is unfortunately no other way to do this.</p>
<p>VCF Operations for Network is currently no longer deployed because my vDS kept getting out of sync, which caused problems with IPFIX. I will address this issue when I rebuild the vDS. After that, I will try again to get NetFlow running on my Mikrotik so that I can get useful network insights.
However, this is still a work in progress.</p>
<figure><a href="05.png"><picture><source srcset="/lab-setup/05_hu_6df6be8da61cddc6.png" type="image/png">
          <img
            src="/lab-setup/05_hu_6df6be8da61cddc6.png"alt="VCF Operations"width="3376"
            height="1418"/>
        </picture></a><figcaption><p>VCF Operations Overview (click to enlarge)</p></figcaption></figure>
<h2 id="final-thoughts">Final thoughts</h2>
<p>My setup isn&rsquo;t particularly complicated, and everything is designed so that I can save resources and minimize 24/7 use without sacrificing too much comfort—anyone who has ever looked at the procedure for shutting down a VKS-enabled cluster knows what I mean. But not everything is perfect, and there is certainly room for improvement.
My Ansible playbooks need to be revised because they no longer run properly with the latest version. In the future, I would also like to deploy much more automatically, i.e., I would prefer to deploy my standard VCF deployments fully automatically.</p>
<p>My monitoring with Uptime Kuma, including a Discord bot and notifications, is a good start, but it was never really finished. I&rsquo;m also not quite sure where I want to go with my VCF automation setup, but I think the next few weeks and months will show. If you have any suggestions, questions, or improvements, please feel free to contact me. Homelab thrives on exchange.</p>
]]></content>
		</item>
		
		<item>
			<title>VCF9 -  VCF5.X to VCF9 upgrade</title>
			<link>https://sdn-warrior.org/posts/vcf5-to-9-upgrade/</link>
			<pubDate>Sun, 07 Sep 2025 23:00:08 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf5-to-9-upgrade/</guid>
			<description><![CDATA[A brief guide to upgrading from VCF5.X to VCF9 on a brownfield site.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>I have been asked several times on LinkedIn for an article on the topic of brownfield upgrades from VCF 5 to VCF 9.
Well, here we are. I thought it would be relatively easy and that you could just click your way through, but I was wrong.
But let&rsquo;s start at the beginning.</p>
<p>As things stand at the moment, if you have an NSX version newer than 4.2.1.3 in your VCF environment, here&rsquo;s the bad news: the upgrade is not possible as of today.
NSX 4.2.1.4 and newer has a newer code base than NSX 9.0.0.0 and is therefore not compatible. The problem should be fixed with VCF 9.0.1.</p>
<figure><a href="01.png"><picture><source srcset="/vcf5-to-vcf9/01_hu_ead8116ed0948b0.png" type="image/png">
          <img
            src="/vcf5-to-vcf9/01_hu_ead8116ed0948b0.png"alt="Interoperability Matrix"width="708"
            height="367"/>
        </picture></a><figcaption><p>Interoperability Matrix (click to enlarge)</p></figcaption></figure>
<p>It is best to always check the Product Interoperability Matrix for this.</p>
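<p>The version rule described above can be expressed as a small pre-check. This is a sketch only: the cut-off of 4.2.1.3 is taken from this post and applies to the situation at the time of writing, the function names are made up, and the Product Interoperability Matrix remains the authoritative source.</p>

```python
# Minimal sketch of the NSX version gate described in the text.
# ASSUMPTION: 4.2.1.3 is the last NSX 4.x version that can be upgraded to
# NSX 9.0.0.0, as stated in the post at the time of writing.
LAST_COMPATIBLE_NSX = (4, 2, 1, 3)


def parse_version(text):
    """Turn a dotted version string such as '4.2.1.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def upgrade_blocked(nsx_version):
    """True if the given NSX version is newer than the last compatible one."""
    parsed = parse_version(nsx_version)
    # Pad short versions so that '4.2.1' is compared as 4.2.1.0.
    padded = parsed + (0,) * (4 - len(parsed))
    return padded > LAST_COMPATIBLE_NSX


for version in ("4.2.1", "4.2.1.3", "4.2.1.4"):
    print(version, "blocked" if upgrade_blocked(version) else "ok")
```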
<p>Since my last available VCF 5 lab was unfortunately already patched to the latest NSX version, I had no choice but to deploy a fresh VCF 5.2.1, as the NSX version here is low enough.
And oh my God, I really didn&rsquo;t miss the Excel or JSON deployment of VCF 5.X, but for my readers&rsquo; sake, I&rsquo;ll struggle through it once again.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">I will only deploy one management domain in VCF 5.2.1, and I will also skip deploying the edge VMs.
However, this does not matter, as you always have to upgrade the management domain first, and the workload domains will function the same way afterwards.</div>
    </aside>
<h2 id="management-domain-and-basic-setup">Management domain and basic setup</h2>
<p>I have deployed a management domain with 4 ESXi servers and vSAN. I have set up the whole thing in a nested configuration.
Since the SDDC Manager does not yet support download tokens, I used the VMwareDepotChange script.
I have already covered this process in my <a href="https://sdn-warrior.org/posts/vcf-token/">blog</a>.</p>
<table>
  <thead>
      <tr>
          <th>Software Component</th>
          <th>Version</th>
          <th>Date</th>
          <th>Build Number</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Cloud Builder VM</td>
          <td>5.2.1</td>
          <td>09 OCT 2024</td>
          <td>24307856</td>
      </tr>
      <tr>
          <td>SDDC Manager</td>
          <td>5.2.1</td>
          <td>09 OCT 2024</td>
          <td>24307856</td>
      </tr>
      <tr>
          <td>VMware vCenter Server Appliance</td>
          <td>8.0 Update 3c</td>
          <td>09 OCT 2024</td>
          <td>24305161</td>
      </tr>
      <tr>
          <td>VMware ESXi</td>
          <td>8.0 Update 3b</td>
          <td>17 SEP 2024</td>
          <td>24280767</td>
      </tr>
      <tr>
          <td>VMware NSX</td>
          <td>4.2.1</td>
          <td>09 OCT 2024</td>
          <td>24304122</td>
      </tr>
  </tbody>
</table>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">I am not deploying Operations because I will integrate the VCF installation into my existing VCF 9 Operations later, after the upgrade.</div>
    </aside>
<p>After the initial deployment, I updated the SDDC to version 5.2.1.1, as it contained several bug fixes. Finally, I took a snapshot of the SDDC Manager, and we were ready to go.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Broadcom has published a <a href="https://knowledge.broadcom.com/external/article/390634/update-sequence-for-vcf-90-and-compatibl.html">KB article</a> on the update sequence for VCF9. Feel free to take a look at it. Depending on which VCF components you are using, you may need to take other steps beforehand.</div>
    </aside>
<h2 id="uprade-sddc">Upgrade SDDC</h2>
<p>In my setup, the first step is to upgrade the SDDC Manager. As I mentioned above, after the upgrade, the setup will be added to my VCF Operations, which is already up to date.</p>
<p><figure><a href="02.png"><picture><source srcset="/vcf5-to-vcf9/02_hu_c9c6aff32408d3a0.png" type="image/png">
          <img
            src="/vcf5-to-vcf9/02_hu_c9c6aff32408d3a0.png"alt="SDDC Manager"width="1700"
            height="874"/>
        </picture></a><figcaption><p>SDDC Manager upgrade (click to enlarge)</p></figcaption></figure>
The process is relatively straightforward. The 2 GB of data must be downloaded from the network via the SDDC, after which the upgrade can be initiated.</p>
<figure><a href="03.png"><picture><source srcset="/vcf5-to-vcf9/03_hu_696a5f8dcb47a053.png" type="image/png">
          <img
            src="/vcf5-to-vcf9/03_hu_696a5f8dcb47a053.png"alt="SDDC Manager status"width="1720"
            height="1259"/>
        </picture></a><figcaption><p>SDDC Manager upgrade status (click to enlarge)</p></figcaption></figure>
<p>The SDDC is rebooted once during the upgrade. I had a minor problem with the SDDC in that it no longer displayed the upgrade status after rebooting.
Instead, I only saw update status 0. I checked the log file /var/log/vmware/capengine/cap-update/workflow.log via SSH on the SDDC manager and looked to see if the following was visible:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">YYYY-MM-DDTHH:MM:SS.426528 workflow_manager.go:221: Task stage-cleanup completed
</span></span><span class="line"><span class="cl">YYYY-MM-DDTHH:MM:SS.426646 workflow_manager.go:183: All tasks finished for workflow
</span></span><span class="line"><span class="cl">YYYY-MM-DDTHH:MM:SS.426699 workflow_manager.go:354: Updating instance status to Completed
</span></span></code></pre></div><p>After that, I rebooted the SDDC again and the error was fixed. I had never had this problem before and couldn&rsquo;t find any KB articles about it. Maybe my nested environment was too slow. So take this fix with a grain of salt.
Finally, I configured my VCF9 offline repo in the SDDC as a repository, as I don&rsquo;t want to download all the VCF9 binaries from the network.</p>
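<p>The log check described above can also be scripted. This is only a minimal sketch: the log path and the search string come from my log excerpt above, while the function name upgrade_workflow_completed is made up for illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell">#!/bin/sh
# Log path as used in this post; the function name is my own invention.
WORKFLOW_LOG="/var/log/vmware/capengine/cap-update/workflow.log"

# Returns 0 if the log contains the final "Completed" line shown above.
# An optional first argument overrides the default log path.
upgrade_workflow_completed() {
    grep -q "Updating instance status to Completed" "${1:-$WORKFLOW_LOG}"
}

# Example on the SDDC Manager:
# upgrade_workflow_completed || echo "workflow not finished yet"
</code></pre></div>
<p>If the function returns 0, the workflow itself has finished and only the status display is stuck. In my case, that was the point at which a second reboot helped.</p>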
<h2 id="binary-management">Binary Management</h2>
<p>SDDC version 9 supports the Broadcom download token, so there is no need to change anything with a script on the SDDC, but I prefer my central offline repo.
I built my own repo with the help of <a href="https://github.com/tsugliani/doc-vcf-offlinedepot">Timo Sugliani&rsquo;s instructions</a>.
It is the easiest way to get your own offline repo, even if you have no Linux knowledge. However, you still need a valid download token.</p>
<figure><a href="04.png"><picture><source srcset="/vcf5-to-vcf9/04_hu_98e33eaae8a5d692.png" type="image/png">
          <img
            src="/vcf5-to-vcf9/04_hu_98e33eaae8a5d692.png"alt="Binary Management"width="1704"
            height="807"/>
        </picture></a><figcaption><p>Binary Management (click to enlarge)</p></figcaption></figure>
<p>Next, I download the upgrade binaries for NSX, vCenter, and ESX to my SDDC manager.
The download is faster than the validation. It took me about 30 minutes to get everything I needed.</p>
<h2 id="starting-the-upgrade">Starting the upgrade</h2>
<p>I start the upgrade with the SDDC for the management domain. To do this, go to Workload Domains —&gt; Management Domain —&gt; Updates in the SDDC manager and then click on the Plan Upgrades button.
The process is actually exactly the same as for a normal VCF update.</p>
<figure><a href="05.png"><picture><source srcset="/vcf5-to-vcf9/05_hu_2ae425d6b399da1a.png" type="image/png">
          <img
            src="/vcf5-to-vcf9/05_hu_2ae425d6b399da1a.png"alt="Plan Upgrade"width="1721"
            height="978"/>
        </picture></a><figcaption><p>Plan Upgrade (click to enlarge)</p></figcaption></figure>
<p>I select VCF version 9.0.0.0 as the target version and confirm the version summary.</p>
<figure><a href="06.png"><picture><source srcset="/vcf5-to-vcf9/06_hu_abe67a33ca6fa587.png" type="image/png">
          <img
            src="/vcf5-to-vcf9/06_hu_abe67a33ca6fa587.png"alt="Version summary"width="1479"
            height="685"/>
        </picture></a><figcaption><p>Version summary (click to enlarge)</p></figcaption></figure>
<p>This completes the upgrade plan. It is important to note that the SDDC will only start the upgrade once we have clicked Configure Update.
The order of the components is as follows:</p>
<ul>
<li>NSX</li>
<li>vCenter</li>
<li>ESX hosts</li>
<li>vSAN</li>
</ul>
<p>You can only configure the update for one component at a time. Pre-checks must be performed before each update.
Here, too, the process is no different from a normal VCF update.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Since I have a nested VCF setup with vSAN, there will be some errors. However, these can be ignored.</div>
    </aside>
<h3 id="nsx">NSX</h3>
<p>Since I don&rsquo;t have any edges deployed, the NSX upgrade runs without any problems.
If I had edge VMs in my environment, I would have had to intervene because of the Ryzen processors, otherwise the edge upgrade would have failed.
I explained why this is the case in my article on <a href="https://sdn-warrior.org/posts/ms-a2/#wheres-the-poop-robin">MS-A2</a>.</p>
<h3 id="vcenter">vCenter</h3>
<p>Next up is the vCenter. Here, the vCenter <em><strong>Reduced Downtime Update</strong></em> is used. This means that a second vCenter appliance is deployed in parallel and assigned a temporary IP address.
There are two options here: either I assign a static temporary IP address or I use Auto.</p>
<p>I usually use IP addresses defined by my customers, as firewalls are usually involved in whatever form.
Since I am working here in my lab without a firewall within the lab, I can use auto.
This gives the old and new vCenter a temporary APIPA IP to communicate with each other.
For the upgrade scheduler, I use Immediate and Automatic for the switchover.</p>
<figure><a href="07.png"><picture><source srcset="/vcf5-to-vcf9/07_hu_abdd3a6ae02197cf.png" type="image/png">
          <img
            src="/vcf5-to-vcf9/07_hu_abdd3a6ae02197cf.png"alt="vCenter update"width="1714"
            height="948"/>
        </picture></a><figcaption><p>vCenter update (click to enlarge)</p></figcaption></figure>
<h3 id="esx---houston-we-have-a-problem">ESX - Houston, we have a problem!</h3>
<p>The update ran successfully, but I have the problem that my vCenter simply does not want to display ESX9 images, which is essential because without ESX9 images, you cannot upgrade the ESX servers to 9.
I thought I was being clever and forced the vCenter to search for new images. You can do this in the vCenter Lifecycle Manager by clicking <em><strong>Sync Updates</strong></em>.</p>
<figure><a href="08.png"><picture><source srcset="/vcf5-to-vcf9/08_hu_921f3fbdcf534880.png" type="image/png">
          <img
            src="/vcf5-to-vcf9/08_hu_921f3fbdcf534880.png"alt="vCenter Lifecycle Manager"width="1663"
            height="1182"/>
        </picture></a><figcaption><p>vCenter Lifecycle Manager (click to enlarge)</p></figcaption></figure>
<p>Interestingly, the process is not running correctly. I hadn&rsquo;t encountered this before, and after a quick check in the task log in vCenter, I saw the following error message, which wasn&rsquo;t really helpful.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">A general system error occurred: Download patch definitions task failed while syncing depots.
</span></span><span class="line"><span class="cl">Error: &#39;integrity.fault.VcIntegrityFault: VMware vSphere Lifecycle Manager had an unknown error.
</span></span><span class="line"><span class="cl">Check the events and log files for details.
</span></span></code></pre></div><figure><a href="09.png"><picture><source srcset="/vcf5-to-vcf9/09_hu_c0e114b7232b8a7e.png" type="image/png">
          <img
            src="/vcf5-to-vcf9/09_hu_c0e114b7232b8a7e.png"alt="vCenter Tasks"width="1406"
            height="451"/>
        </picture></a><figcaption><p>vCenter Tasks (click to enlarge)</p></figcaption></figure>
<p>Of course, there is nothing specific in the logs either. So, on to Google research&hellip; because the second most important skill in my IT career is knowing how to Google properly. Maybe it&rsquo;s even my most important skill.</p>
<p>Turns out it&rsquo;s a known bug in vCenter 9 when performing a brownfield upgrade from 8 to 9.
VMware has created a <a href="https://knowledge.broadcom.com/external/article?articleNumber=402037">KB article</a> on this topic.
The solution is relatively simple: you need to delete metadata from the vCenter DB in the tables <em><strong>vci_updates</strong></em> and <em><strong>vci_update_packages</strong></em>.</p>
<h3 id="vcenter-workaround">vCenter workaround</h3>
<ul>
<li>Log in to vCenter as root</li>
<li>Stop updatemgr service</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">service-control --stop updatemgr
</span></span></code></pre></div><ul>
<li>Change to &lsquo;updatemgr&rsquo; user.</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">su updatemgr -s /bin/bash
</span></span></code></pre></div><ul>
<li>Run psql command to patch the DB</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">psql -U vumuser -d VCDB -c &#34;UPDATE vci_updates set deleted = 0, hidden = 0 where meta_uid like &#39;ESXi7%&#39; or meta_uid like &#39;esxi7%&#39; or meta_uid like &#39;ESXi_7%&#39; or meta_uid like &#39;esxi_7%&#39;; DELETE FROM vci_updates where id not in (select update_id from vci_update_packages);&#34;
</span></span></code></pre></div><ul>
<li>Start updatemgr service</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">service-control --start updatemgr
</span></span></code></pre></div><p>After the changes and the restart of the service, vCenter automatically syncs the image data. Now a new image can be created.</p>
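<p>If you would rather know what the DELETE in the workaround removes before running it, you can count the affected rows first. This pre-check is my own addition and not part of the KB article. It is a sketch that reuses only the user, database, table and column names from the command above; the variable and function names are hypothetical.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell">#!/bin/sh
# Read-only preview: counts the rows that the DELETE statement of the
# workaround would remove. It changes nothing in the database.
PREVIEW_SQL="SELECT count(*) FROM vci_updates WHERE id NOT IN (SELECT update_id FROM vci_update_packages);"

# Run this as the updatemgr user, i.e. after "su updatemgr -s /bin/bash".
preview_orphaned_updates() {
    psql -U vumuser -d VCDB -c "$PREVIEW_SQL"
}
</code></pre></div>
<p>Calling preview_orphaned_updates before the patch command shows how many entries in vci_updates have no matching package. Outside of a lab, I would also take a snapshot of the vCenter before touching the database.</p>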
<figure><a href="10.png"><picture><source srcset="/vcf5-to-vcf9/10_hu_9fe426594b598ab4.png" type="image/png">
          <img
            src="/vcf5-to-vcf9/10_hu_9fe426594b598ab4.png"alt="vCenter Images"width="1356"
            height="682"/>
        </picture></a><figcaption><p>vCenter Lifecycle Manager after workaround (click to enlarge)</p></figcaption></figure>
<h3 id="esx-image">ESX Image</h3>
<p>For Image Mode, we still need to import an image into the SDDC; without this, the ESX servers cannot be upgraded to 9.
After fixing the vCenter, I create a dummy cluster in the vCenter and select the correct image for VCF9.</p>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">It is important that the image from the BOM is used. For VCF9.0.0.0, this is ESX 9.0.0.0 Build 24755229.
There is now a patch for ESX 9 (9.0.0.0100), but this cannot be used for the upgrade to VCF9.
Instead, it must be installed via VCF 9 Operations after the upgrade to VCF9 has been successfully completed.</div>
    </aside>
<p>The image is then imported into the SDDC and the dummy cluster can be deleted again in vCenter.
The image is assigned to the management domain cluster in the “Configure Update” step, and after the pre-checks, the ESX servers can be updated.
The process ran smoothly, which is what I had expected from the start.</p>
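<p>To double-check afterwards that a host really ended up on the BOM build, the version string can be compared against the build number mentioned above. A minimal sketch, assuming SSH is enabled on the host and that the version string contains the usual build-NNN suffix; the function name is made up.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell">#!/bin/sh
# Expected ESX build for VCF 9.0.0.0 according to the BOM quoted above.
EXPECTED_BUILD="24755229"

# Returns 0 if the given version string contains the expected build number.
is_bom_build() {
    case "$1" in
        *"build-$EXPECTED_BUILD"*) return 0 ;;
        *) return 1 ;;
    esac
}

# On the ESX host itself:
# is_bom_build "$(vmware -v)" || echo "host is not on the BOM build"
</code></pre></div>
<p>A host that already carries the 9.0.0.0100 patch would fail this check, which is exactly what you want to know before the VCF upgrade is finished.</p>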

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">Before each update, I check vSAN Skyline Health and silence any alarms that may be present due to nested virtualization and the resulting non-certified hardware.</div>
    </aside>
<h3 id="vsan">vSAN</h3>
<p>The final step in the actual upgrade process is to update the vSAN storage.
The process is straightforward, and all you need to do is press the upgrade button under cluster &ndash;&gt; configure &ndash;&gt; vSAN.
Depending on the hardware and size of the vSAN storage, this process can take a relatively long time.</p>
<figure><a href="11.png"><picture><source srcset="/vcf5-to-vcf9/11_hu_180cbcea5e64a3d2.png" type="image/png">
          <img
            src="/vcf5-to-vcf9/11_hu_180cbcea5e64a3d2.png"alt="vSAN"width="1715"
            height="672"/>
        </picture></a><figcaption><p>vSAN upgrade (click to enlarge)</p></figcaption></figure>
<p>Now everything is ready for me to start onboarding in Operations.</p>
<h3 id="onboarding-in-vcf-operations">Onboarding in VCF Operations</h3>
<p>I have a central VCF Operations because I don&rsquo;t want to deploy an entire fleet every time I need a new nested lab.
I described how I built it in this <a href="https://sdn-warrior.org/posts/vcf-operations-migration/">article</a>.
When onboarding in Operations, there are really only two things to keep in mind.
I have to configure the new VCF environment as a <em><strong>deployment target</strong></em>, and I have to enable <em><strong>activate management</strong></em>, otherwise I cannot distribute licenses for the environment with my VCF Operations instance.</p>
<p>To onboard, I need to add the VCF instance to my Ops via VCF Operations &ndash;&gt; Administration &ndash;&gt; Integrations &ndash;&gt; Accounts &ndash;&gt; VMware Cloud Foundation.
Once the environment appears under Integrations, you can select it and press the Activate Management button. This allows you to assign a license to the VCF instance.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">It may take up to 15 minutes for the setting to become active and for VCF Operations to display the VCF instance in the licensing dialog.</div>
    </aside>
<p>Finally, the VCF instance is set up as a deployment target under Lifecycle &ndash;&gt; VCF Management &ndash;&gt; Settings &ndash;&gt; Deployment Targets.</p>
<p>The upgrade is now complete and the VCF 9 instance can be used as normal.
Optionally, you can install the ESX patch under Lifecycle.
To do this, you must create a new image in vCenter and import it via Ops.
The process is more or less identical to the previous update.</p>
<figure><a href="12.png"><picture><source srcset="/vcf5-to-vcf9/12_hu_4a8510897aca01db.png" type="image/png">
          <img
            src="/vcf5-to-vcf9/12_hu_4a8510897aca01db.png"alt="VCF Operations"width="1437"
            height="1049"/>
        </picture></a><figcaption><p>VCF Operations (click to enlarge)</p></figcaption></figure>
<h2 id="final-thoughts">Final thoughts</h2>
<p>The upgrade was more difficult than expected.
The missing images in particular took me longer than I would like to admit.
To be honest, I thought it would be a straightforward process.
My VCF 5.2.1 environment was completely new, with no previous updates or other legacy issues.</p>
<p>I think VCF 9.0.1 will remedy this, because in theory the upgrade process is very simple and doesn&rsquo;t really differ much from a normal patch update in VCF5.X.</p>
<p>What I still want to test is an import, i.e., converting a standalone NSX installation into a workload domain.
Unfortunately, the same applies here as with my old NSX Labs: when updating my physical lab, I destroyed most of my old setups (because I&rsquo;m sometimes just stupid).
Only labs that are unusable in this case have survived.</p>
<p>Well, I&rsquo;ll do another NSX 4.2.1 deployment at some point. But that will be another blog post.
If you have any questions or comments, please feel free to send me a message on LinkedIn or to my blog email address.</p>
]]></content>
		</item>
		
		<item>
			<title>VMware Explore 2025</title>
			<link>https://sdn-warrior.org/posts/explore-2025/</link>
			<pubDate>Sun, 31 Aug 2025 12:00:00 +0000</pubDate>
			
			<guid>https://sdn-warrior.org/posts/explore-2025/</guid>
			<description><![CDATA[My VMware Explore 2025]]></description>
			<content type="html"><![CDATA[<h2 id="viva-las-vegas---a-kind-of-travelogue">Viva Las Vegas - A kind of travelogue.</h2>
<p>At the beginning of the year, I had the pleasure of flying to the <a href="https://sdn-warrior.org/posts/vmug-connect-stl-2025/">USA</a>, and now I&rsquo;m on my way to Las Vegas.
After a slightly delayed departure, I arrived in Vegas after an 11.5-hour flight in a &ldquo;comfortable&rdquo; 40 degrees Celsius or 104 degrees Fahrenheit, for those who do not use the metric system.
Perhaps one more small note. I won&rsquo;t write much about the sessions; anyone who wants to can look them up in <a href="https://github.com/lamw/vmware-explore-2025-session-urls">William Lam&rsquo;s repository</a>. I want to focus more on the vibe of the event here—that is, I&rsquo;ll try to. I&rsquo;m not really a travel blogger.</p>
<figure><a href="01.jpg"><picture><source srcset="/explore-2025/01_hu_c3f6eae0b817e783.jpg" type="image/jpeg">
          <img
            src="/explore-2025/01_hu_c3f6eae0b817e783.jpg"alt="Viva Las Vegas"width="1944"
            height="1458"/>
        </picture></a><figcaption><p>Viva Las Vegas (click to enlarge)</p></figcaption></figure>
<p>But before I talk or write about Explore itself, I&rsquo;ll first write a little about Las Vegas. When you hear Las Vegas, you have a certain cliché in your head—at least I do.
The city is loud, flashy, and never sleeps, and what can I say, it&rsquo;s all true.
The entertainment dial is turned up to 11—a sea of lights from flashing hotels, casinos, and some very obscure billboards, or even trucks that serve as rolling, illuminated billboards.
As a German, this is a culture shock at first.</p>
<figure><a href="02.jpg"><picture><source srcset="/explore-2025/02_hu_85ced394f9588ce9.jpg" type="image/jpeg">
          <img
            src="/explore-2025/02_hu_85ced394f9588ce9.jpg"alt="Old Vegas"width="1411"
            height="1058"/>
        </picture></a><figcaption><p>Old Vegas (click to enlarge)</p></figcaption></figure>
<p>But if you prefer something a little quieter, I can give you two recommendations. These are certainly not insider tips, but then again, I&rsquo;m not a travel blogger.</p>
<h2 id="hoover-dam-and-mike-ocallaghanpat-tillman-memorial-bridge">Hoover Dam and Mike O’Callaghan–Pat Tillman Memorial Bridge</h2>
<p>Visiting Hoover Dam feels like standing in front of a monument that engineers built just to prove they could. In short, it&rsquo;s a huge thing.</p>
<figure><a href="03.jpg"><picture><source srcset="/explore-2025/03_hu_a3e187e831d2c002.jpg" type="image/jpeg">
          <img
            src="/explore-2025/03_hu_a3e187e831d2c002.jpg"alt="Hoover Dam"width="3000"
            height="2000"/>
        </picture></a><figcaption><p>Hoover Dam (click to enlarge)</p></figcaption></figure>
<p>The concrete wall on the border between Nevada and Arizona was completed in 1936 and still supplies electricity to millions of households today.
The Hoover Dam is so massive that it slows down the Earth&rsquo;s rotation slightly when Lake Mead is full—only a tiny fraction of a second, but the idea alone makes me smile.
When you&rsquo;re at the dam, you can walk back and forth between two states and two time zones. That gives you an extra hour to drink beer—I mean coffee.</p>
<p>Right next to it is the Mike O&rsquo;Callaghan–Pat Tillman Memorial Bridge, better known as the Hoover Dam Bypass Bridge.
From here, you have a spectacular panoramic view of the dam—almost like something out of a postcard.</p>
<figure><a href="04.jpg"><picture><source srcset="/explore-2025/04_hu_9c742af3a0038c4b.jpg" type="image/jpeg">
          <img
            src="/explore-2025/04_hu_9c742af3a0038c4b.jpg"alt="view from the bridge"width="2142"
            height="1606"/>
        </picture></a><figcaption><p>view from the bridge (click to enlarge)</p></figcaption></figure>
<p>There is a museum at Hoover Dam, but I skipped it and therefore can&rsquo;t really say anything about it. However, my colleagues who visited it thought it was good.
I preferred to go directly to the Valley of Fire to have more time there.</p>
<h2 id="valley-of-fire">Valley of Fire</h2>
<p>The Valley of Fire is located about 80 km northeast of Las Vegas and can be reached in less than an hour by car.
The park covers around 186 square kilometres and is the oldest and largest state park in Nevada.
It goes without saying, but it&rsquo;s a desert and it&rsquo;s hot, so be sure to take water with you. You can buy some at the visitor centre or fill up empty bottles for free.
It was about 110 degrees Fahrenheit that day, which is around 43 degrees Celsius, which a taxi driver said was cool.
To me, it felt like sticking my head in a hot air fryer. It may not be the best idea to drive to a desert in the summer, but the landscape makes up for it.</p>
<figure><a href="05.jpg"><picture><source srcset="/explore-2025/05_hu_e25055781f880521.jpg" type="image/jpeg">
          <img
            src="/explore-2025/05_hu_e25055781f880521.jpg"alt="Valley of Fire"width="3000"
            height="2000"/>
        </picture></a><figcaption><p>Valley of Fire (click to enlarge)</p></figcaption></figure>
<p>I don&rsquo;t think I&rsquo;ve ever sweated so much in my life, but it was a fantastic day.
There are five signposted short walks, all between one and two miles long, which are manageable with sufficient sun cream and water.
There is also a road through the park, allowing you to cruise through beautiful and striking scenery in a typical American car.</p>
<p>If that&rsquo;s too low-tech for you, you can also visit the Sphere. It&rsquo;s like an IMAX cinema on steroids, although that doesn&rsquo;t really do it justice.</p>
<h2 id="sphere">Sphere</h2>
<p>Who doesn&rsquo;t know it, the gigantic sphere that peeks shyly into one hotel room or another?
The Sphere is approximately 112 metres high and 157 metres in diameter, a rather large sphere that uses LEDs on the outside to display dancing cats, smileys and other things 24/7.
You can see it as you approach the airport.</p>
<figure><a href="06.jpg"><picture><source srcset="/explore-2025/06_hu_9861b1323d04c8eb.jpg" type="image/jpeg">
          <img
            src="/explore-2025/06_hu_9861b1323d04c8eb.jpg"alt="the Sphere"width="1926"
            height="1444"/>
        </picture></a><figcaption><p>Dancing cats at the Sphere (click to enlarge)</p></figcaption></figure>
<p>We watched Postcard from Earth by Darren Aronofsky, and what can I say? Audiovisually, it was the most impressive experience I&rsquo;ve ever had.
It&rsquo;s very difficult to describe, and even the YouTube videos you can find don&rsquo;t do the experience justice.
If you have the opportunity to visit the Sphere, then do it. Highly recommended.</p>
<h2 id="explore-2025">Explore 2025</h2>
<p>But enough with the travel tips, let&rsquo;s get started with the event. Since I&rsquo;ve only been to European events so far, the first big difference to Barcelona is that Explore is located on several floors of a hotel.
This makes the event a little more compact and not quite as sprawling as at the Fira Gran Via.
Depending on which hotel you&rsquo;re staying in, you might not even see daylight—if you want to avoid it.</p>
<figure><a href="07.jpg"><picture><source srcset="/explore-2025/07_hu_38a6f99667ebd7f2.jpg" type="image/jpeg">
          <img
            src="/explore-2025/07_hu_38a6f99667ebd7f2.jpg"alt="Event Pass"width="1512"
            height="2016"/>
        </picture></a><figcaption><p>Event Pass (click to enlarge)</p></figcaption></figure>
<p>In terms of organization, everything was once again tip-top. Easy check-in, easy to find your way around, friendly staff everywhere who helped you find your way.
What immediately stands out is that it&rsquo;s less busy than expected. I don&rsquo;t have any figures, but the days of 10,000 visitors and more are long gone.
However, with the absence of EUC, Carbon Black, and smaller partners, this was to be expected.
From a visitor&rsquo;s point of view, however, this is not necessarily a bad thing, as you could get into every session without any problems, even if you hadn&rsquo;t registered in advance or the session was officially fully booked.
There were no long queues anywhere, and there were always plenty of snacks, refreshments, and coffee available. I&rsquo;ve had different experiences in Barcelona.</p>
<h2 id="day-1">Day 1</h2>
<p>Day 1 started for me with a session on edge computing. It was co-hosted by Audi. Edge computing is a really exciting topic, and my employer evoila will also be presenting it at our in-house exhibition.
So it was fascinating to hear about the topic from a customer perspective and learn how Audi is using it.
The possibility of building a compact single-host edge cluster with vSAN storage, NSX networking, and automation tools such as ArgoCD in a VCF context has a lot of potential.</p>
<p>After that, I attended a session with <a href="https://vgandalf.com/">Tim Burkard</a>, who explained the network control plane.
Since NSX is my area of expertise, there was nothing new for me here, but sessions with vGandalf are always entertaining and bring out the fanboy in me.</p>
<p>11/10 I would definitely attend again. Of course, a photo with him is a must.</p>
<figure><a href="08.jpg"><picture><source srcset="/explore-2025/08_hu_3bd24a93c5072dbe.jpg" type="image/jpeg">
          <img
            src="/explore-2025/08_hu_3bd24a93c5072dbe.jpg"alt="vGandalf"width="3000"
            height="2000"/>
        </picture></a><figcaption><p>vGandalf and the SDN-Warrior (click to enlarge)</p></figcaption></figure>
<p>I then went on to other sessions, including William Lam&rsquo;s, but this time I didn&rsquo;t manage to get a photo. Next time, then.
In between, I attended the vExpert Community meeting, picked up my swag, and talked to lots of vExperts.</p>
<h3 id="community">Community</h3>
<p>And while we&rsquo;re on the subject of community, I was overwhelmed by how many people stopped to say hello.
I can hardly put it into words, but it was incredible. Thank you so much, it was amazing, and I am grateful to everyone who reads my LinkedIn posts or my blog.
I&rsquo;m still surprised when someone asks me if they can take a photo with me.
Thank you for this incredible support.</p>
<figure><a href="09.jpg"><picture><source srcset="/explore-2025/09_hu_93d93be2520b528d.jpg" type="image/jpeg">
          <img
            src="/explore-2025/09_hu_93d93be2520b528d.jpg"alt="Sebastian Garcia and me"width="2048"
            height="1152"/>
        </picture></a><figcaption><p>Sebastian Garcia and me (click to enlarge)</p></figcaption></figure>
<p>Photo credits go to <a href="https://www.linkedin.com/posts/activity-7367001767860994048-Gcjm?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAADCvLVcBxkTlNAba6yJqH706PjezWwpyC8I">Sebastian Garcia</a>. An absolutely friendly person. Thank you for the nice conversation.</p>
<p>Day one ended with the welcome reception, after which my colleagues and I hit the Vegas nightlife.</p>
<h2 id="day-2">Day 2</h2>
<p>The second day started with little sleep and slight jet lag at breakfast at Explore.
Although breakfast is a funny thing—I understand it to mean something else. This was more like lunch.
After enough coffee, I headed to the general keynote, which had the somewhat unwieldy title Shaping the Future of Private Cloud and AI Innovation.
What was special about the general keynote for me was that I was asked if I would like to contribute a video clip for Paul Turner&rsquo;s part.
<figure><a href="10.jpg"><picture><source srcset="/explore-2025/10_hu_48793013a8c45e45.jpg" type="image/jpeg">
          <img
            src="/explore-2025/10_hu_48793013a8c45e45.jpg"alt="general keynote"width="1272"
            height="713"/>
        </picture></a><figcaption><p>vExperts @ General Keynote (click to enlarge)</p></figcaption></figure></p>
<p>My contribution was shown alongside four other well-known vExperts, including <a href="https://williamlam.com/">William Lam</a>, <a href="https://vmiss.net/">Melissa Palmer</a>, <a href="https://www.ntpro.nl/blog/">Eric Sloof</a> and <a href="https://www.yellow-bricks.com/">Duncan Epping</a>.
Seeing my own face on the main stage in Las Vegas, alongside the other great vExperts, was a bit strange but also great at the same time.</p>
<p>I filmed my part from the audience and uploaded it to YouTube.</p>
<div class="video-container">
  <iframe width="560" height="315"
    src="https://www.youtube-nocookie.com/embed/Iiosv_2kYpY"
    frameborder="0"
    allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
    allowfullscreen>
  </iframe>
</div>
<p>The complete keynote can currently only be viewed directly at <a href="https://www.vmware.com/explore/video-library/video/6377276035112">VMware</a>.</p>
<p>After this highlight, I withdrew to prepare myself mentally for the upcoming certification.
With the full pass, you could complete various certifications on site, and I took the opportunity to take my VCP VCF9 Administrator exam, which I successfully completed.
As a small bonus, everyone who successfully certified received a pair of sneakers.
<figure><a href="11.jpg"><picture><source srcset="/explore-2025/11_hu_908199a26a73ceaf.jpg" type="image/jpeg">
          <img
            src="/explore-2025/11_hu_908199a26a73ceaf.jpg"alt="cert swag"width="2142"
            height="2856"/>
        </picture></a><figcaption><p>Swag (click to enlarge)</p></figcaption></figure>
Swag is always welcome.</p>
<p>In addition to the general keynote, the highlight for me was the session by Duncan Epping and Rakesh Radhakrishnan entitled
<a href="https://event.vmware.com/flow/vmware/explore2025lv/content/page/catalog/session/1742370482000001PYRW">Innovations Redefining Storage and Disaster Recovery for VMware Cloud Foundation 9</a>.
I found it particularly exciting that it will be possible to use vSAN over Fibre Channel in the future, which makes me, as an old storage guy, especially happy.
It should be possible to connect a vSAN storage-only cluster to an existing FC fabric via FC, allowing vSAN and FC storage to be used simultaneously.
This combines the best of both worlds and allows drop-in replacements to be implemented without making my FC fabric obsolete.</p>
<figure><a href="12.jpg"><picture><source srcset="/explore-2025/12_hu_21884b2811e7ec94.jpg" type="image/jpeg">
          <img
            src="/explore-2025/12_hu_21884b2811e7ec94.jpg"alt="Wakuda"width="1958"
            height="1468"/>
        </picture></a><figcaption><p>Fancy entrance to Wakuda (click to enlarge)</p></figcaption></figure>
<p>The day ended with the VMUG reception at Wakuda, where Hock Tan personally addressed the VMUG community.
He promised VMUG support and emphasized how important VMUG is for VMware and Broadcom.</p>
<p><figure><a href="13.jpg"><picture><source srcset="/explore-2025/13_hu_a0665f3411d3d2fe.jpg" type="image/jpeg">
          <img
            src="/explore-2025/13_hu_a0665f3411d3d2fe.jpg"alt="Wakuda vExpert"width="761"
            height="571"/>
        </picture></a><figcaption><p>Edd and me talking about stuff (click to enlarge)</p></figcaption></figure>
There was plenty of space for pleasant conversations at Wakuda. Here, I had the opportunity to chat with Edd. By the way, you should check out his <a href="https://vxworld.co.uk/">blog</a>.</p>
<p>After that, I threw myself into the nightlife just like before. That&rsquo;s a pattern and it will continue.</p>
<h2 id="day-3">Day 3</h2>
<p>This day started off rather leisurely for me. I didn&rsquo;t have a session until 11, so I used the time to prepare this article and chat with a few vExperts.
I also took a closer look at the expo and the community hub and had a few conversations there as well.
I then attended the VMUG Leaders and VMware Champions Meet-up and tried in vain to take another exam.
Unfortunately, there were no more slots available, and my place on the waiting list didn&rsquo;t help either.</p>
<figure><a href="14.jpg"><picture><source srcset="/explore-2025/14_hu_40462cec45c1f15a.jpg" type="image/jpeg">
          <img
            src="/explore-2025/14_hu_40462cec45c1f15a.jpg"alt="VMUG and Champions"width="1644"
            height="2192"/>
        </picture></a><figcaption><p>VMUG Leaders and VMware Champions Meet-up (click to enlarge)</p></figcaption></figure>
<p>As part of the VMware Champions program, I will also be involved in a session on VPCs. The exact details are not yet known.</p>
<p>Sadly, I missed the vExpert group photo while trying to get a spot for the exam.
My colleague <a href="https://sdn-techtalk.com/">Steven Schramm</a> and I then went to the Knight Welcome Reception.</p>
<p>Work hard, play hard. Afterwards, there was the Explore closing party on the pool deck of the Palazzo hotel with music and cool refreshments, which was, of course, fantastic.</p>
<figure><a href="15.jpg"><picture><source srcset="/explore-2025/15_hu_cf4bcaaa821e7d91.jpg" type="image/jpeg">
          <img
            src="/explore-2025/15_hu_cf4bcaaa821e7d91.jpg"alt="Vparty"width="1944"
            height="1458"/>
        </picture></a><figcaption><p>The Party (click to enlarge)</p></figcaption></figure>
<p>For most of the evoila team, this was also the last day of Explore. For Steven and me, things continued on Thursday with an exclusive full-day event for Broadcom Knights.</p>
<h2 id="knight-day">Knight Day</h2>
<p>The day started at 8 a.m. with an update on the Knight program and what changes will be made. Spoiler alert: I have to complete a number of certifications.
The rest of the day was spent looking at the roadmap and a condensed version of the last three days.
<figure><a href="16.jpg"><picture><source srcset="/explore-2025/16_hu_72a5e58efcbb3853.jpg" type="image/jpeg">
          <img
            src="/explore-2025/16_hu_72a5e58efcbb3853.jpg"alt="knight"width="2142"
            height="2856"/>
        </picture></a><figcaption><p>Knight Training (click to enlarge)</p></figcaption></figure>
Since the day is under NDA, I can&rsquo;t say anything about the roadmap except that I&rsquo;m looking forward to the many new features in VCF 9.X.
The day ended with an evening event at the Excalibur Hotel called Tournament of Kings. In a not-too-historically-accurate version of the European Middle Ages, the battle between good and evil was fought over a cold beer and fried chicken.
<figure><a href="17.jpg"><picture><source srcset="/explore-2025/17_hu_a5a02d8646827dec.jpg" type="image/jpeg">
          <img
            src="/explore-2025/17_hu_a5a02d8646827dec.jpg"alt="knight event"width="2741"
            height="2056"/>
        </picture></a><figcaption><p>Tournament of Kings (click to enlarge)</p></figcaption></figure>
It was dramatic, loud, American, and very entertaining. Thank you for the wonderful day, dear Knight Team.</p>
<h2 id="last-words">Last words</h2>
<p>Since everything must come to an end, Friday was our departure day, and we spent our last dollars on food and boarded our flight back across the pond.
What remains? A great event, wonderful conversations, new and old faces, and a lot of gratitude for everything.
Thanks also to my employer, <a href="https://evoila.com/">evoila</a>, who sent 13 people to Las Vegas, making us the largest German group there.
Oh, and we achieved <a href="https://evoila.com/de/partner/vmware/">Pinnacle status</a>, which is also worth mentioning.
I&rsquo;m already looking forward to the upcoming Knight exclusive events and Explore on Tour in Germany.</p>
]]></content>
		</item>
		
		<item>
			<title>VCF 9 - Certificate exchange</title>
			<link>https://sdn-warrior.org/posts/vcf9-cert-exchange/</link>
			<pubDate>Tue, 19 Aug 2025 09:21:00 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf9-cert-exchange/</guid>
			<description><![CDATA[A brief guide to replacing certificates in VCF9 with VCF Operations.]]></description>
			<content type="html"><![CDATA[<h2 id="certificates--the-mother-of-all-evils">Certificates – the mother of all evils</h2>
<p>In the last 25 years of working in IT, I have never met anyone who is happy when certificates expire and need to be replaced – quite the opposite, in fact.
Certificates are like printers: everyone hates them, but we need them.
Fortunately, VCF, and VCF 9 in particular, has greatly simplified the entire procedure.
In this article, I would like to briefly explain how you can use VCF Operations and a Microsoft CA to exchange certificates easily and, in some cases, automatically – guaranteed pain-free.
However, it is still not an enjoyable process.</p>
<h2 id="lets-get-started">Let&rsquo;s get started</h2>
<p>VCF 9 currently supports two certificate authority integrations: Microsoft CA and OpenSSL.
Since I already have a Windows domain running, I simply use Microsoft CA for the sake of simplicity.
Nevertheless, the completely manual method is also supported, which I will show at the end of this article.</p>
<p>I will use the Microsoft Certification Authority Web Enrollment Service. If you want to use fully automated certificate enrolment with a Microsoft CA, there is no way around this.
Please note that the web service requires basic authentication and poses a security risk if it is not configured correctly.</p>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">Here I show a basic setup of the RootCA. This is not set up according to Microsoft&rsquo;s best practices, and the servers should be hardened as a matter of urgency.
In addition, separate accounts should be used in accordance with the least privilege principle.</div>
    </aside>
<h2 id="preparing-the-rootca">Preparing the RootCA</h2>
<p>The following server roles must be enabled on the CA:</p>
<ul>
<li>Certification Authority</li>
<li>Certification Authority Web Enrollment</li>
</ul>
<p>After installing the roles, Basic Authentication must be enabled in IIS.</p>
<figure><a href="01.png"><picture><source srcset="/vcf9-ca/01_hu_2099d9062f9f2ac8.png" type="image/png">
          <img
            src="/vcf9-ca/01_hu_2099d9062f9f2ac8.png"alt="IIS Settings"width="1195"
            height="543"/>
        </picture></a><figcaption><p>IIS Settings (click to enlarge)</p></figcaption></figure>
<p>Next, a dedicated certificate template must be created. This template defines the attributes used to sign the VCF component certificates.
To do this, I start the Certificate Templates console by running certtmpl.msc.
I duplicate the default template Web Server.</p>
<p>In the new template, adjust the compatibility settings so that Windows Server 2008 R2 is specified as the certification authority and Windows 7 / Server 2008 R2 as the certificate recipient.
Give the template a unique name, e.g. VMware.</p>
<p>Remove ‘Server authentication’ from the application policies under Extensions, enable the ‘Basic Constraints’ extension, and in the key usage settings, enable ‘Signature is proof of origin (non-repudiation)’ while keeping the default settings for the rest.
On the Subject Name tab, make sure that ‘Supply in the request’ is selected. Save the new template once all settings have been applied.</p>
<p>Finally, add the template to the Microsoft certification authority. To do this, open ‘certsrv.msc’, go to ‘Certificate Templates’ and select ‘New → Certificate Template to Issue’.
Select the template you just created and activate it.</p>
<figure><a href="02.png"><picture><source srcset="/vcf9-ca/02_hu_e679ace365a35fcd.png" type="image/png">
          <img
            src="/vcf9-ca/02_hu_e679ace365a35fcd.png"alt="Template settings"width="968"
            height="750"/>
        </picture></a><figcaption><p>Template settings (click to enlarge)</p></figcaption></figure>
<p>The final step is to assign the rights. The CA user requires the following rights:</p>
<ul>
<li>Issue and Manage Certificates</li>
<li>Request Certificates</li>
</ul>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">As already mentioned, I use the Administrator account in my lab, which is not safe for production and should be changed.</div>
    </aside>
<p>This completes the process on the Windows side. The CA can now be used either manually or automatically via VCF Operations.</p>
<h2 id="preparing-vcf-operations">Preparing VCF Operations</h2>
<p>The cool thing is that you can now replace not only the certificates of the VCF instance components but, if you use a central VCF Operations like I do, also the certificates of the VCF management components.</p>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">VCF management components only support Microsoft Certificate Authority. VCF Instance components support both Microsoft Certificate Authority and OpenSSL.</div>
    </aside>
<p>This is also practical, as I can store different CAs per VCF instance in VCF Operations. Perhaps I will write a second article in which I do the whole thing via OpenSSL CA.
To set the CA for a VCF instance, go to <em><strong>Fleet Management</strong></em> -&gt; <em><strong>Certificates</strong></em> and then select the VCF instance or VCF Management, depending on whether it is to be configured for the management components or the VCF instance.</p>
<figure><a href="03.png"><picture><source srcset="/vcf9-ca/03_hu_f9605ba0750f6018.png" type="image/png">
          <img
            src="/vcf9-ca/03_hu_f9605ba0750f6018.png"alt="CA Settings"width="1440"
            height="921"/>
        </picture></a><figcaption><p>CA Settings (click to enlarge)</p></figcaption></figure>
<p>The settings are self-explanatory. It is important that the complete username including the domain is used and that the template name is spelled exactly as created (it is case-sensitive).
For the CA server URL, HTTPS is recommended: basic authentication only Base64-encodes the username and password, so without TLS they are effectively transmitted in plain text.
Therefore, as a minimum security measure, transport encryption with TLS should be used here.</p>
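<p>To illustrate why TLS matters here, the following Python sketch shows what basic authentication puts on the wire. It is only an illustration: the function names and the credentials are hypothetical and have nothing to do with the VCF or Microsoft implementation. The point is that the header is merely Base64-encoded and can be decoded by anyone who can read the traffic.</p>

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Authorization header for basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth_header(header):
    """Recover username and password from a basic authentication header."""
    scheme, token = header.split(" ", 1)
    if scheme != "Basic":
        raise ValueError("not a basic authentication header")
    username, password = base64.b64decode(token).decode("utf-8").split(":", 1)
    return username, password


# Hypothetical lab credentials, for illustration only.
header = basic_auth_header("LAB\\svc-vcf-ca", "VMware1!")
print(header)
# Decoding needs no key at all, which is why the transport must be encrypted.
print(decode_basic_auth_header(header))
```

<p>Without HTTPS, anyone on the network path can run the second function against a captured request.</p>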

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">If firewalls are used, communication between SDDC and the RootCA must of course be ensured.
Time is also a very critical issue here, so all components should be synchronised with the same NTP server.</div>
    </aside>
<h2 id="request-certificate">Request certificate</h2>
<p>In my case, I want to replace the vCenter certificate in my workload domain.
To do this, I select my workload domain in VCF Operations, click on the component that is to receive a new certificate, and select <em><strong>Generate CSR</strong></em>.</p>
<figure><a href="04.png"><picture><source srcset="/vcf9-ca/04_hu_eee040b0b6618b03.png" type="image/png">
          <img
            src="/vcf9-ca/04_hu_eee040b0b6618b03.png"alt="request certificate"width="1704"
            height="951"/>
        </picture></a><figcaption><p>request certificate (click to enlarge)</p></figcaption></figure>
<p>The Generate CSR dialogue box appears and some information must be added.
The Common Name, Host and Subject Alternative Name (SAN) are pre-filled and should not be changed. The following fields must be completed or modified:</p>
<ul>
<li>Organisation</li>
<li>Organisation Unit</li>
<li>Country</li>
<li>State/Province</li>
<li>Locality</li>
<li>Key Size</li>
</ul>
<figure><a href="05.png"><picture><source srcset="/vcf9-ca/05_hu_cf43a6a3ef5728b6.png" type="image/png">
          <img
            src="/vcf9-ca/05_hu_cf43a6a3ef5728b6.png"alt="request certificate details"width="1145"
            height="1022"/>
        </picture></a><figcaption><p>request certificate details (click to enlarge)</p></figcaption></figure>
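<p>The fields from the dialogue end up in the subject of the certificate request. As a rough sketch of how they fit together, the following Python snippet assembles a subject string from them. The function name, the host name, and all values are made up for illustration, and the two-letter check reflects the X.509 country attribute, not a documented VCF Operations validation.</p>

```python
def build_subject(common_name, organisation, organisation_unit, locality, state, country):
    """Assemble a comma-separated subject string from the CSR dialogue fields."""
    # The X.509 country attribute is a two-letter ISO 3166 code such as DE or US.
    if len(country) != 2 or not country.isalpha():
        raise ValueError("country must be a two-letter code, e.g. DE")
    parts = [
        ("CN", common_name),
        ("OU", organisation_unit),
        ("O", organisation),
        ("L", locality),
        ("ST", state),
        ("C", country.upper()),
    ]
    # Note: values containing commas would need escaping, which this sketch skips.
    return ",".join(f"{key}={value}" for key, value in parts)


# Hypothetical values, for illustration only.
print(build_subject("vcenter-wld01.lab.example", "Lab", "IT", "Berlin", "Berlin", "de"))
```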
<p>Once saved, a new certificate request is created in the background. The private key remains on the server that created it.
You can download the request if a manual signing request is used or required.
This would be the case if you do not want to use the Microsoft Certification Authority Web Enrollment Service for security reasons or if you do not use either Microsoft or OpenSSL CA.</p>
<p>Once the certificate request has been successfully created, all you need to do is select <em><strong>Replace With Configured CA Certificate</strong></em> and the automatic enrolment process will start.</p>
<figure><a href="06.png"><picture><source srcset="/vcf9-ca/06_hu_6614813af0cb3ade.png" type="image/png">
          <img
            src="/vcf9-ca/06_hu_6614813af0cb3ade.png"alt="certificate enrollment"width="1697"
            height="1025"/>
        </picture></a><figcaption><p>certificate enrollment (click to enlarge)</p></figcaption></figure>
<figure><a href="07.png"><picture><source srcset="/vcf9-ca/07_hu_11a063900456700f.png" type="image/png">
          <img
            src="/vcf9-ca/07_hu_11a063900456700f.png"alt="certificate enrollment ca"width="1102"
            height="678"/>
        </picture></a><figcaption><p>choose CA (click to enlarge)</p></figcaption></figure>
<p>The SDDC now takes care of all further steps. A new certificate is automatically requested from the CA, signed and then exchanged on the relevant system.
Depending on the system, the process takes a few minutes, so it&rsquo;s best to go and make a cup of coffee while you wait, as you can&rsquo;t do anything else at this point.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">VMware Cloud Foundation supports automatic renewal of Transport Layer Security (TLS) certificates for components that have VMCA, Microsoft CA, OpenSSL, or self-signed certificates and support non-disruptive certificate updates.</div>
    </aside>
<figure><a href="08.png"><picture><source srcset="/vcf9-ca/08_hu_cbe0d063bc09db16.png" type="image/png">
          <img
            src="/vcf9-ca/08_hu_cbe0d063bc09db16.png"alt="new certificate"width="1280"
            height="1095"/>
        </picture></a><figcaption><p>new certificate (click to enlarge)</p></figcaption></figure>
<p>And (e)voilà, here is our beautiful new certificate (sorry for the bad pun about my employer evoila).
The cool thing is that certificate renewal can now be done with a simple button press. No more filling out annoying requests or interacting with the CA. One button press, a cup of coffee, a little wait, and the job is done – that&rsquo;s how I like it.</p>
<p>But Daniel, you wanted to show us how to do it manually!</p>
<h2 id="extra---the-manual-way">Extra - the manual way</h2>
<p>It all starts as before with a request.
To ensure that the certificate can be easily exchanged with VCF Operations, the request must be executed via Operations and then simply downloaded via the GUI.</p>
<p>With the request, I now create a certificate at my CA and save it Base64-encoded.
I now have to import the certificate together with the CA certificate.
To do this, the certificate must be PEM-encoded. Alternatively, the certificate can also be imported as a Base64-encoded certificate chain. VCF Operations supports both options.</p>
<p>Server Certificate</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">-----BEGIN CERTIFICATE-----
</span></span><span class="line"><span class="cl">MIIDdzCCAl+gAwIBAgIEbmh8+TANBgkq...
</span></span><span class="line"><span class="cl">...your server/leaf certificate...
</span></span><span class="line"><span class="cl">-----END CERTIFICATE-----
</span></span></code></pre></div><p>Root Certificate</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">-----BEGIN CERTIFICATE-----
</span></span><span class="line"><span class="cl">MIIFQTCCAymgAwIBAgISA9pL1W...
</span></span><span class="line"><span class="cl">...intermediate CA certificate...
</span></span><span class="line"><span class="cl">-----END CERTIFICATE-----
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">-----BEGIN CERTIFICATE-----
</span></span><span class="line"><span class="cl">MIIDrzCCApegAwIBAgIQCDvgVpF3...
</span></span><span class="line"><span class="cl">...root CA certificate (optional)...
</span></span><span class="line"><span class="cl">-----END CERTIFICATE-----
</span></span></code></pre></div><figure><a href="09.png"><picture><source srcset="/vcf9-ca/09_hu_617b2736fe27311a.png" type="image/png">
          <img
            src="/vcf9-ca/09_hu_617b2736fe27311a.png"alt="import new certificate"width="1577"
            height="1016"/>
        </picture></a><figcaption><p>import new certificate (click to enlarge)</p></figcaption></figure>
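<p>If you prefer to import a single chain file, the PEM blocks shown above can be concatenated with a few lines of Python. This is a minimal sketch under my own assumptions: the function names are hypothetical, the order (server certificate first, then the CA certificates) is the usual convention rather than a documented VCF Operations requirement, and the certificate bodies below are placeholders, not real certificates.</p>

```python
import re

# Matches one PEM certificate block, including its BEGIN and END markers.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----\s+.*?\s+-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem(text):
    """Return every PEM certificate block found in text, in order."""
    return PEM_BLOCK.findall(text)


def build_chain(leaf_pem, ca_pem):
    """Concatenate the server certificate and the CA certificates into one bundle."""
    leaf = split_pem(leaf_pem)
    if len(leaf) != 1:
        raise ValueError("expected exactly one server certificate")
    cas = split_pem(ca_pem)
    if not cas:
        raise ValueError("expected at least one CA certificate")
    return "\n".join(leaf + cas) + "\n"


# Placeholder bodies only; real files contain the Base64 data issued by the CA.
leaf = "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n"
cas = (
    "-----BEGIN CERTIFICATE-----\nBBBB\n-----END CERTIFICATE-----\n\n"
    "-----BEGIN CERTIFICATE-----\nCCCC\n-----END CERTIFICATE-----\n"
)
chain = build_chain(leaf, cas)
print(len(split_pem(chain)), "certificates in the chain")
```

<p>The sketch only checks the textual structure. It does not verify that the certificates actually chain to each other.</p>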
<p>Once the certificate has been successfully imported, it can now be assigned. To do this, select the certificate in VCF Operations that is to be replaced and click on the button <em><strong>Replace with imported certificate</strong></em>.</p>
<figure><a href="10.png"><picture><source srcset="/vcf9-ca/10_hu_99044ae7ed48075d.png" type="image/png">
          <img
            src="/vcf9-ca/10_hu_99044ae7ed48075d.png"alt="replace with imported certificate"width="1577"
            height="912"/>
        </picture></a><figcaption><p>replace with imported certificate (click to enlarge)</p></figcaption></figure>
<p>VCF Operations suggests the certificate. If this does not work, you must select the correct certificate from the drop-down menu. In my case, it worked and the correct certificate was selected.
Click Replace and head to the coffee machine.</p>
<p><figure><a href="11.png"><picture><source srcset="/vcf9-ca/11_hu_d95799790909595d.png" type="image/png">
          <img
            src="/vcf9-ca/11_hu_d95799790909595d.png"alt="fresh NSX certificate"width="1510"
            height="1046"/>
        </picture></a><figcaption><p>fresh NSX certificate (click to enlarge)</p></figcaption></figure>
After another cup of coffee, my NSX certificate was successfully exchanged. Damn, I&rsquo;ve had a lot of coffee today, but that&rsquo;s sometimes how it is in IT.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Certificates are still a pain in the neck in 2025, but VCF Operations has made the process much, much easier.
The ability to easily exchange certificates for both management components and the components of my VCF instances is brilliant.
Unfortunately, there is still no nice built-in feature that allows me to do this with a Microsoft CA without the Web Enrollment Service.</p>
<p>The fact that I can automatically renew my certificates with a single button is also something that makes the whole process much easier for me.
Many of the things just shown were already possible in older VCF versions, but now that everything is centralised in VCF Operations and works for all my VCF instances, it&rsquo;s just great. I will take a look at enrolment with an OpenSSL CA when I have time, but for now I am focusing on the Microsoft CA, as it is really widespread in practice and I am more familiar with the Windows CA than with the OpenSSL CA.</p>
<p>Maybe one more thing:
It is also possible to exchange the certificates of the ESX servers. You just have to activate the <em><strong>Show ESX Hosts</strong></em> slider.</p>
<p>Now that you&rsquo;ve read this far, how painful is certificate exchange for you, and how much coffee have you had? Feel free to let me know on <a href="https://www.linkedin.com/in/daniel-krieger-6476591a9/">LinkedIn</a>.</p>
]]></content>
		</item>
		
		<item>
			<title>VCF 9 - NSX VPC Part 3 - Security</title>
			<link>https://sdn-warrior.org/posts/vcf9-nsx-vpc-part3/</link>
			<pubDate>Tue, 29 Jul 2025 00:10:05 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf9-nsx-vpc-part3/</guid>
			<description><![CDATA[A short article about VPCs in NSX 9 and VCF 9 Part 3.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>In my two other VPC articles, I showed you the basics and the differences between <a href="https://sdn-warrior.org/posts/vcf9-nsx-vpc/">centralized deployment</a> and <a href="https://sdn-warrior.org/posts/vcf9-nsx-vpc-part2/">distributed deployment</a>.
Today, I would like to write something about security, as VPCs offer various possibilities in this area.
VPCs provide a certain degree of isolation through their private networks, as these cannot be easily routed from outside the VPC.
But what if we want to protect a VM on a public network, or a private VM that has been assigned a public IP? This is where the distributed firewall and gateway firewall come into play.
Let&rsquo;s find out exactly how.</p>
<h2 id="nsx-project">NSX Project</h2>
<p>When it comes to isolation, we must also briefly address the topic of NSX Project.
A Project is nothing more than a tenant in NSX. The name is debatable; personally, I don&rsquo;t find it particularly apt.</p>
<figure><a href="01.png"><picture><source srcset="/vcf9-nsx-vpc-3/01_hu_ef7927319b725433.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-3/01_hu_ef7927319b725433.png"alt="NSX 9 Projects"width="3712"
            height="1681"/>
        </picture></a><figcaption><p>NSX 9 Projects (click to enlarge)</p></figcaption></figure>
<p>There is always the default project, and most of my customers only use that and are happy with it.
However, by default I don&rsquo;t have any isolation through the distributed firewall in the default project.
The default rule set initially allows all traffic, and in terms of VPCs, this means that everything that is routable can also be accessed.</p>
<p>But there is a solution. When creating a new project, you can have a default set of rules created that isolates the project from other tenants, including the default tenant.
This can also be changed later. To do this, enable the <em><strong>Activate Default Distributed Firewall Rules</strong></em> option in the project settings.</p>
<p>The default Distributed Firewall rules allow communication between workloads within a Project, including communication with the DHCP server.
All other communication is blocked.</p>
<figure><a href="02.png"><picture><source srcset="/vcf9-nsx-vpc-3/02_hu_85ffcea64c07e0ef.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-3/02_hu_85ffcea64c07e0ef.png"alt="NSX 9 Projects FW"width="1502"
            height="846"/>
        </picture></a><figcaption><p>NSX 9 Projects FW Rules (click to enlarge)</p></figcaption></figure>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">It is not possible to automatically create such a set of rules for the default project.
The default project will continue to have the familiar default policy and, by default, no traffic will be dropped.</div>
    </aside>
<p>The automatically generated rule set can be edited, expanded, and deactivated, but not deleted.
A dynamic security group is automatically created, which includes all segments of the project. This security group is used in an “allow any” rule within the project.</p>
<figure><a href="03.png"><picture><source srcset="/vcf9-nsx-vpc-3/03_hu_ca9fb199ec66e758.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-3/03_hu_ca9fb199ec66e758.png"alt="NSX 9 Projects FW Group"width="692"
            height="557"/>
        </picture></a><figcaption><p>NSX 9 Projects FW Group (click to enlarge)</p></figcaption></figure>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">I would not extend this default set of rules with my own rules, as NSX creates this policy without TCP-Strict.
I will explain why I personally find this somewhat unsightly later in the article.</div>
    </aside>
<p>This completes the initial isolation between tenants, because even if the default tenant has an allow Any rule, the default drop rule of the project will reject the traffic if it comes from outside the tenant.
However, this also means that my public VMs are no longer public, as they can only be accessed within the project.
Internet access is also no longer possible, as all traffic to and from the project is blocked.
I should change this.</p>
<h2 id="distributed-firewall">Distributed Firewall</h2>
<p>This is where the fun begins, and unfortunately we also have to take a look at the role model, because depending on which user role I am logged in with,
I have different options for placing my distributed firewall rules.
The placement of a firewall rule matters, because it determines whether and when the rule is enforced.</p>
<h3 id="nsx-vpc-role-model-broadcom">NSX VPC Role Model (Broadcom)</h3>
<table>
  <thead>
      <tr>
          <th>Role</th>
          <th>Responsibilities</th>
          <th>Scope</th>
          <th>Firewall scope</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Enterprise Admin</strong></td>
          <td>- Configures N/S connectivity<br>- Creates projects and external IP blocks<br>- Assigns IP blocks and quotas to projects</td>
          <td>Entire platform / tenant</td>
          <td>Entire platform</td>
      </tr>
      <tr>
          <td><strong>Project Admin</strong></td>
          <td>- Creates VPCs<br>- Assigns IP blocks and quotas to VPCs</td>
          <td>Within assigned project</td>
          <td>Only within the project and its VPCs</td>
      </tr>
      <tr>
          <td><strong>VPC Admin</strong></td>
          <td>- Creates subnets from pre-allocated IP blocks<br>- Attaches vNICs to VMs</td>
          <td>Within assigned VPC</td>
          <td>Only within the VPC</td>
      </tr>
  </tbody>
</table>
<p>To better understand the table, it is important to know that rules can be created not only in the <em><strong>Security / Distributed Firewall</strong></em> tab, but also at the VPC level by going to the VPC settings and clicking <em><strong>E-W Firewall Rules</strong></em> for distributed firewall rules or <em><strong>N-S Firewall Rules</strong></em> for gateway firewall rules.</p>
<figure><a href="04.png"><picture><source srcset="/vcf9-nsx-vpc-3/04_hu_3732dea6c6941d89.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-3/04_hu_3732dea6c6941d89.png"alt="Firewall on VPC level"width="1109"
            height="697"/>
        </picture></a><figcaption><p>Firewall on VPC level (click to enlarge)</p></figcaption></figure>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">To use this feature, an additional license is required for both the gateway firewall and the distributed firewall. Unfortunately, VCF 9 does not come with the firewall add-on license.</div>
    </aside>
<h3 id="rule-order">Rule order</h3>
<p>It is important to note that if you create a firewall rule via the VPC and not via <em><strong>Security / Distributed Firewall</strong></em>, it is created in the Application category. It is applied after the conventionally created Distributed Firewall rules, but before the default Layer 3 policy.
This means that the Project Admin or Enterprise Admin can override VPC rules at any time. By default, VPC rules are hidden in the Distributed Firewall. However, you can display the VPC objects in the DFW.</p>
<figure><a href="05.png"><picture><source srcset="/vcf9-nsx-vpc-3/05_hu_eefa72ef4eea0d43.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-3/05_hu_eefa72ef4eea0d43.png"alt="NSX 9 Firewall rule order"width="1778"
            height="1230"/>
        </picture></a><figcaption><p>NSX 9 FW rule order (click to enlarge)</p></figcaption></figure>
<p>It is important to understand that the categories and the Apply To field still determine the processing order.
A deny-all project rule in the Environment category takes precedence over a default project rule in the Application category with Apply To set to DFW. However, a project rule is only valid within its project, even if Apply To is set to DFW.
A firewall rule from the default project with Apply To set to DFW is enforced on all VMs.
If rules from different sources are in the same firewall category, the default project rules are always processed first, then the project-specific rules, and only then the VPC rules, provided that the rules are relevant for the VM (Apply To). If a default project rule is not enforced on the VPC VM via Apply To, it is of course not processed.
If no rule matches, our rule no. 2, also known as the default Layer 3 policy, comes into play. In production environments, this rule should always be a drop all and log rule.</p>
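<p>The ordering described above can be sketched as a small sort. This is only a rough model and not how NSX implements it: the <code>Rule</code> class, the scope labels (<code>dfw</code>, <code>project-a</code>, <code>vpc3</code>) and the rule names are hypothetical and exist purely for illustration. Rules are first filtered by Apply To, then sorted by category and, within a category, by their source.</p>

```python
from dataclasses import dataclass

# NSX DFW categories in evaluation order.
CATEGORIES = ["Ethernet", "Emergency", "Infrastructure", "Environment", "Application"]
# Within one category: default project first, then project, then VPC.
SOURCES = ["default-project", "project", "vpc"]


@dataclass(frozen=True)
class Rule:
    name: str              # hypothetical label, for illustration only
    category: str
    source: str
    applied_to: frozenset  # hypothetical scope labels standing in for Apply To


def effective_rules(rules, vm_scopes):
    """Return the rules relevant for a VM, in processing order."""
    relevant = [r for r in rules if not r.applied_to.isdisjoint(vm_scopes)]
    return sorted(
        relevant,
        key=lambda r: (CATEGORIES.index(r.category), SOURCES.index(r.source)),
    )


rules = [
    Rule("allow-icmp", "Application", "vpc", frozenset({"vpc3"})),
    Rule("block-https", "Application", "project", frozenset({"project-a"})),
    Rule("block-http", "Application", "default-project", frozenset({"dfw"})),
]
vpc3_vm = {"dfw", "project-a", "vpc3"}
vpc4_vm = {"dfw", "project-a", "vpc4"}

print([r.name for r in effective_rules(rules, vpc3_vm)])
print([r.name for r in effective_rules(rules, vpc4_vm)])
```

<p>In this model, the VPC3 VM ends up with all three rules, while the VPC4 VM only sees the default project rule and the project rule.</p>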
<p>To make this clearer, I have prepared an example. In my default project policy, I block HTTP for all VMs (Apply To set to DFW). In the project-specific policy, I block the HTTPS port for all project-specific VMs, and in the policy for VPC3, I allow ICMP on all VPC3 VMs.</p>
<figure><a href="06.png"><picture><source srcset="/vcf9-nsx-vpc-3/06_hu_38acd649e0df59b1.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-3/06_hu_38acd649e0df59b1.png"alt="NSX 9 Firewall rule set"width="1450"
            height="1066"/>
        </picture></a><figcaption><p>NSX 9 FW rule set (click to enlarge)</p></figcaption></figure>
<figure><a href="07.png"><picture><source srcset="/vcf9-nsx-vpc-3/07_hu_431d4da8a24c90e3.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-3/07_hu_431d4da8a24c90e3.png"alt="NSX 9 VPC Firewall rule set"width="1152"
            height="640"/>
        </picture></a><figcaption><p>NSX 9 VPC FW rule set (click to enlarge)</p></figcaption></figure>
<p>We can also view all of this on the CLI.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[root@vcf09-m01-esx01:~] vsipioctl getrules -f nic-1056617-eth0-vmware-sfw.2
</span></span><span class="line"><span class="cl">ruleset mainrs {
</span></span><span class="line"><span class="cl">  # generation number: 0
</span></span><span class="line"><span class="cl">  # realization time : 2025-07-27T21:30:00
</span></span><span class="line"><span class="cl">  # FILTER (APP Category) rules
</span></span><span class="line"><span class="cl">  rule 12268 at 1 inout protocol tcp strict from any to any port 80 drop;
</span></span><span class="line"><span class="cl">  rule 12264 at 2 inout protocol tcp strict from any to any port 443 drop;
</span></span><span class="line"><span class="cl">  rule 11240 at 3 inout protocol icmp from any to any accept;
</span></span><span class="line"><span class="cl">  rule 10216 at 4 inout protocol ipv6-icmp icmptype 136 from any to any accept;
</span></span><span class="line"><span class="cl">  rule 10216 at 5 inout protocol ipv6-icmp icmptype 135 from any to any accept;
</span></span><span class="line"><span class="cl">  rule 10217 at 6 inout protocol udp from any to any port {67, 68} accept;
</span></span><span class="line"><span class="cl">  rule 10218 at 7 inout protocol udp from any to any port {546, 547} accept;
</span></span><span class="line"><span class="cl">  rule 10219 at 8 inout protocol any from addrset 7a87cbca-ca80-4042-9507-558bb69b94c8 to addrset 7a87cbca-ca80-4042-9507-558bb69b94c8 accept;
</span></span><span class="line"><span class="cl">  rule 10220 at 9 inout protocol any from any to any drop;
</span></span><span class="line"><span class="cl">  rule 3 at 10 inout inet6 protocol ipv6-icmp icmptype 136 from any to any accept;
</span></span><span class="line"><span class="cl">  rule 3 at 11 inout inet6 protocol ipv6-icmp icmptype 135 from any to any accept;
</span></span><span class="line"><span class="cl">  rule 4 at 12 inout protocol udp from any to any port {67, 68} accept;
</span></span><span class="line"><span class="cl">  rule 2 at 13 inout protocol any from any to any accept;
</span></span><span class="line"><span class="cl">}
</span></span></code></pre></div><p>In the CLI output, you can clearly see the order in which the rules are processed. Here is the cross-check with a VM from VPC4.
You can see that the default project rule and the project rule apply, but the VPC3-specific rule (ID 11240) does not exist.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[root@vcf09-m01-esx01:~] vsipioctl getrules -f nic-1056658-eth0-vmware-sfw.2
</span></span><span class="line"><span class="cl">ruleset mainrs {
</span></span><span class="line"><span class="cl">  # generation number: 0
</span></span><span class="line"><span class="cl">  # realization time : 2025-07-27T21:36:42
</span></span><span class="line"><span class="cl">  # FILTER (APP Category) rules
</span></span><span class="line"><span class="cl">  rule 12268 at 1 inout protocol tcp strict from any to any port 80 drop;
</span></span><span class="line"><span class="cl">  rule 12264 at 2 inout protocol tcp strict from any to any port 443 drop;
</span></span><span class="line"><span class="cl">  rule 10216 at 3 inout protocol ipv6-icmp icmptype 136 from any to any accept;
</span></span><span class="line"><span class="cl">  rule 10216 at 4 inout protocol ipv6-icmp icmptype 135 from any to any accept;
</span></span><span class="line"><span class="cl">  rule 10217 at 5 inout protocol udp from any to any port {67, 68} accept;
</span></span><span class="line"><span class="cl">  rule 10218 at 6 inout protocol udp from any to any port {546, 547} accept;
</span></span><span class="line"><span class="cl">  rule 10219 at 7 inout protocol any from addrset 7a87cbca-ca80-4042-9507-558bb69b94c8 to addrset 7a87cbca-ca80-4042-9507-558bb69b94c8 accept;
</span></span><span class="line"><span class="cl">  rule 10220 at 8 inout protocol any from any to any drop;
</span></span><span class="line"><span class="cl">  rule 3 at 9 inout inet6 protocol ipv6-icmp icmptype 136 from any to any accept;
</span></span><span class="line"><span class="cl">  rule 3 at 10 inout inet6 protocol ipv6-icmp icmptype 135 from any to any accept;
</span></span><span class="line"><span class="cl">  rule 4 at 11 inout protocol udp from any to any port {67, 68} accept;
</span></span><span class="line"><span class="cl">  rule 2 at 12 inout protocol any from any to any accept;
</span></span><span class="line"><span class="cl">}
</span></span></code></pre></div><h2 id="gateway-firewall">Gateway Firewall</h2>
<p>To use the gateway firewall, it must first be activated for the project.
For VPCs, the gateway firewall is implemented on the VPC gateway and not on a T1 or T0 router.
To activate it, you need at least project admin rights; the setting is located in the <em><strong>VPC Security Profile</strong></em>.
It is then active project-wide and can be configured by a VPC admin.</p>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">If a project admin disables the gateway firewall for the project and rules are still configured in the VPC, these will continue to be displayed as active but will no longer be enforced.
The VPC admin cannot see whether the gateway firewall is still active or not.</div>
    </aside>
<p>The rules are configured in the VPC setup under <em><strong>N-S Firewall Rules</strong></em>.
They are not visible under <em><strong>Security / Gateway Firewall</strong></em>, regardless of the permissions my user has.</p>
<figure><a href="08.png"><picture><source srcset="/vcf9-nsx-vpc-3/08_hu_7d96f9c7edbc1e61.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-3/08_hu_7d96f9c7edbc1e61.png"alt="NSX 9 VPC Gateway Firewall"width="1150"
            height="544"/>
        </picture></a><figcaption><p>NSX 9 VPC Gateway Firewall (click to enlarge)</p></figcaption></figure>
<p>The configuration itself is very simple and straightforward.
Of course, the gateway firewall does not work with the distributed transit gateway, as edge nodes are required for the gateway firewall.
You also need to pay attention to the order in which the firewalls are processed.
The gateway firewall only intervenes when traffic is sent via the VPC gateway to the transit gateway or vice versa.
This means that communication within a VPC cannot be blocked by the gateway firewall.</p>
<p>For outgoing traffic, the distributed firewall is processed first, followed by the gateway firewall.
For incoming traffic, the gateway firewall is processed first, followed by the distributed firewall.
If the destination is located in another VPC or in another NSX project, the firewalls are processed again at the destination, this time as incoming traffic.
If a gateway firewall is configured at T0, the order for outgoing traffic would be distributed firewall, gateway firewall on the VPC gateway, and then gateway firewall on T0, and vice versa for incoming traffic.</p>
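<p>As a memory aid, the processing order can be written down as a tiny function. The function name and the string labels are my own and purely illustrative; the order itself is the one described above.</p>

```python
def enforcement_path(direction, t0_gateway_firewall=False):
    """List the firewalls a packet crosses between a VPC VM and the outside world."""
    outgoing = ["distributed firewall", "VPC gateway firewall"]
    if t0_gateway_firewall:
        outgoing.append("T0 gateway firewall")
    if direction == "outgoing":
        return outgoing
    if direction == "incoming":
        # Incoming traffic passes the same enforcement points in reverse order.
        return list(reversed(outgoing))
    raise ValueError("direction must be 'outgoing' or 'incoming'")


print(enforcement_path("outgoing", t0_gateway_firewall=True))
print(enforcement_path("incoming", t0_gateway_firewall=True))
```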
<p>The gateway firewall does not offer us as many options as the distributed firewall, and it also requires edges. North-south traffic can also be effectively protected with the distributed firewall, although you may need to give a little more thought to the design of the rule set.</p>
<h2 id="bonus-round---tcp-strict">Bonus Round - TCP Strict</h2>
<p>At the beginning of the article, I wrote that NSX has disabled TCP strict in the default policy.
This is partly because TCP strict cannot work on an any/any rule; at least one TCP service must be specified.
It is also irrelevant for a drop rule. But why is this the case?
TCP Strict is a firewall mechanism that controls the 3-way handshake of TCP.
In short, when the first packet is a SYN packet from the client, the firewall checks whether the subsequent packet is a SYN-ACK from the server.
Only when the final ACK packet from the client arrives is the connection considered established and the firewall allows the user data to pass.
Essentially, the firewall enforces clean TCP communication between the client and server and prevents half-open connections. Asymmetric routing is also blocked.
This can cause problems with NSX in particular if the edges peer north-south with a firewall (Check Point firewalls, for example, call this feature anti-spoofing).</p>
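<p>To illustrate the idea, here is a deliberately simplified toy model of handshake tracking. It is not the NSX implementation; the class name, the state names and the flow tuple are assumptions made for this sketch, and real firewalls also track sequence numbers, timeouts, RST and FIN.</p>

```python
class StrictTcpTracker:
    """Toy model of handshake enforcement; not the NSX implementation."""

    def __init__(self, strict=True):
        self.strict = strict
        self.flows = {}  # flow id -> "SYN_SEEN" | "SYNACK_SEEN" | "ESTABLISHED"

    def allow(self, flow, flags, from_client=True):
        if not self.strict:
            return True  # without strict checking, every matching packet passes
        state = self.flows.get(flow)
        if flags == "SYN" and from_client and state is None:
            self.flows[flow] = "SYN_SEEN"
            return True
        if flags == "SYN-ACK" and not from_client and state == "SYN_SEEN":
            self.flows[flow] = "SYNACK_SEEN"
            return True
        if flags == "ACK" and from_client and state == "SYNACK_SEEN":
            self.flows[flow] = "ESTABLISHED"
            return True
        # Data (modelled here as further ACK segments) only passes in established flows.
        return state == "ESTABLISHED" and flags == "ACK"


flow = ("192.0.2.10", 40000, "10.28.11.35", 443)  # hypothetical client, target from the test below
strict_fw = StrictTcpTracker(strict=True)
print(strict_fw.allow(flow, "ACK"))                       # stray ACK, no handshake seen
print(StrictTcpTracker(strict=False).allow(flow, "ACK"))  # same packet without strict
```

<p>In this model, a stray ACK is dropped when strict checking is on and passes when it is off, which is the behavior the test further down demonstrates.</p>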
<p>If you now look at the default firewall rules of NSX for VPCs, you will quickly see that TCP Strict would not make sense in the policy.</p>
<figure><a href="09.png"><picture><source srcset="/vcf9-nsx-vpc-3/09_hu_b2b7ed939f84b3ad.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-3/09_hu_b2b7ed939f84b3ad.png"alt="NSX default VPC Rules"width="1419"
            height="348"/>
        </picture></a><figcaption><p>NSX 9 default VPC Firewall rules (click to enlarge)</p></figcaption></figure>
<p>This is because TCP Strict would not work with ICMP rules and the default DHCP rules (UDP) anyway, as there is no TCP traffic and TCP Strict does not apply with allow any/any or drop any/any.</p>
<p>No problem, right, Daniel? That&rsquo;s exactly where my criticism comes in. Although you can&rsquo;t delete the policy or enable TCP Strict, you can add as many firewall rules as you like – it&rsquo;s so convenient and I don&rsquo;t need to create a new policy.
The only catch is that the new rules (part of the default policy) don&rsquo;t have TCP Strict either, which weakens my firewall rules even though there&rsquo;s no reason for it.
As a result, I can attack my server with modified TCP packets. This can happen intentionally or unintentionally (for example, due to a software bug). For example, I can easily bring down a test VM with TCP ACK packets using hping3.
Normally, if TCP Strict is enabled, ACK packets without a preceding and matching SYN packet would simply be discarded by the firewall.</p>
<p>To perform a simple test, I added an Allow HTTPS rule to the default VPC firewall policy (this is the policy without TCP Strict) and bombarded my system with hping3 and TCP ACK packets.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">hping3 -A --flood -p 443 -d 1200 10.28.11.35
</span></span></code></pre></div><p>Parameters:</p>
<ul>
<li>-A  Sends TCP ACK packets (no connection establishment, simulates response traffic)</li>
<li>--flood Sends packets as fast as possible without waiting for responses</li>
<li>-p 443  Target port: 443 (HTTPS)</li>
<li>-d 1200 Sets the payload size to 1200 bytes</li>
<li>10.28.11.35 Target IP address</li>
</ul>
<figure><a href="10.png"><picture><source srcset="/vcf9-nsx-vpc-3/10_hu_4eeba0a2c6a6f197.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-3/10_hu_4eeba0a2c6a6f197.png"alt="btop"width="1717"
            height="1131"/>
        </picture></a><figcaption><p>btop view of the victim VM (click to enlarge)</p></figcaption></figure>
<p>The result is quite spectacular in btop: 100% CPU utilization of the VM and approx. 10 Gb/s of network traffic, all of which passes through the firewall to the VM.
To double-check, I create a new firewall policy. NSX always has TCP Strict enabled by default for new policies. The rule is the same, allow HTTPS, but this time with TCP Strict enabled.</p>
<figure><a href="11.png"><picture><source srcset="/vcf9-nsx-vpc-3/11_hu_9303c3a0725b92fe.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-3/11_hu_9303c3a0725b92fe.png"alt="hit count"width="418"
            height="451"/>
        </picture></a><figcaption><p>Firewall rule statistics (click to enlarge)</p></figcaption></figure>
<p>As you can see in the rule statistics, the new rules receive a large number of hits, but since there is no valid TCP handshake, the distributed firewall discards all packets and no packets arrive at the VM.</p>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">Just because the firewall rejects the traffic does not mean that you will not encounter problems. In my case, the transport node (ESX) received messages that the enhanced DP flow table usage was very high. However, the ESX host still received all packets, which caused problems in my small nested lab. When I perform the same test with VMs connected to a centralized transit gateway, my edge VMs start to lose their N/S connection because they go directly into overload. However, it should be noted that I have small edge VMs in the lab that are not designed for a data throughput of 10 Gb/s.</div>
    </aside>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[root@vcf09-m01-esx03:~] vsipioctl getrules -f nic-1164294-eth0-vmware-sfw.2
</span></span><span class="line"><span class="cl">ruleset mainrs {
</span></span><span class="line"><span class="cl">  # generation number: 0
</span></span><span class="line"><span class="cl">  # realization time : 2025-07-28T21:27:00
</span></span><span class="line"><span class="cl">  # FILTER (APP Category) rules
</span></span><span class="line"><span class="cl">  rule 13288 at 1 inout protocol tcp strict from any to any port 443 accept;
</span></span><span class="line"><span class="cl">  rule 13289 at 2 inout protocol tcp from any to any port 443 accept;
</span></span><span class="line"><span class="cl">  rule 5101 at 3 inout protocol ipv6-icmp icmptype 136 from any to any accept;
</span></span><span class="line"><span class="cl">  rule 5101 at 4 inout protocol ipv6-icmp icmptype 135 from any to any accept;
</span></span><span class="line"><span class="cl">  rule 5102 at 5 inout protocol udp from any to any port {67, 68} accept;
</span></span><span class="line"><span class="cl">  rule 5103 at 6 inout protocol udp from any to any port {546, 547} accept;
</span></span><span class="line"><span class="cl">  rule 5104 at 7 inout protocol any from addrset ba52d50f-55b3-453b-bff2-b2dd72beca3c to addrset ba52d50f-55b3-453b-bff2-b2dd72beca3c accept;
</span></span><span class="line"><span class="cl">  rule 5105 at 8 inout protocol any from any to any accept;
</span></span><span class="line"><span class="cl">  rule 3 at 9 inout inet6 protocol ipv6-icmp icmptype 136 from any to any accept;
</span></span><span class="line"><span class="cl">  rule 3 at 10 inout inet6 protocol ipv6-icmp icmptype 135 from any to any accept;
</span></span><span class="line"><span class="cl">  rule 4 at 11 inout protocol udp from any to any port {67, 68} accept;
</span></span><span class="line"><span class="cl">  rule 2 at 12 inout protocol any from any to any accept;
</span></span><span class="line"><span class="cl">}
</span></span></code></pre></div><p>In the CLI, you can also clearly see that rule 13288 uses TCP strict and rule 13289 does not.</p>
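<p>If you want to check this for more than a handful of rules, the output can be parsed with a few lines of Python. This is a minimal sketch that only covers the line shapes shown in the outputs above; the regular expression and the function names are my own, and other rule variants may need additional handling.</p>

```python
import re

# Matches lines such as:
#   rule 13288 at 1 inout protocol tcp strict from any to any port 443 accept;
# Simplified pattern, derived only from the outputs shown in this article.
RULE_RE = re.compile(
    r"rule (\d+) at (\d+) inout (?:inet6 )?protocol (\S+)( strict)?"
    r"(?: icmptype \d+)? from .* (accept|drop);"
)


def parse_rules(output):
    """Extract (rule id, position, protocol, strict, action) tuples from getrules output."""
    parsed = []
    for line in output.splitlines():
        match = RULE_RE.search(line)
        if match:
            rule_id, position, protocol, strict, action = match.groups()
            parsed.append((int(rule_id), int(position), protocol, strict is not None, action))
    return parsed


def tcp_rules_without_strict(output):
    """Return the IDs of TCP rules that do not carry the strict keyword."""
    return [r[0] for r in parse_rules(output) if r[2] == "tcp" and not r[3]]


sample = """
  rule 13288 at 1 inout protocol tcp strict from any to any port 443 accept;
  rule 13289 at 2 inout protocol tcp from any to any port 443 accept;
  rule 5101 at 3 inout protocol ipv6-icmp icmptype 136 from any to any accept;
  rule 5102 at 5 inout protocol udp from any to any port {67, 68} accept;
"""
print(tcp_rules_without_strict(sample))
```

<p>For the sample lines taken from the output above, only rule 13289 is reported as a TCP rule without strict.</p>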
<h2 id="what-can-we-learn-from-this">What can we learn from this?</h2>
<p>Firstly, you should think carefully about where and how you write your firewall rules, and secondly, you can use hping3 to disable your lab. :D</p>
<p>Furthermore, TCP Strict does not help with SYN flood attacks, as NSX creates a TCP session for each SYN packet and waits 120 seconds for the SYN-ACK. In the case of an hping3 attack, the server would eventually no longer be able to respond, especially since hping3 has a parameter for random source addresses and you end up simply occupying all of the server&rsquo;s ports. Therefore, it may be useful to create a flood protection profile. These can be created separately for each project.
However, you should be familiar with the traffic profile of the environment and, in general, caution is advised with such mechanisms, as overly strict settings can also block desired traffic.</p>
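<p>A quick back-of-the-envelope calculation shows why the 120-second wait matters. The flood rate below is a hypothetical number, not a measurement from my lab, and the sketch assumes that every SYN creates its own session entry that lives for the full timeout.</p>

```python
def half_open_sessions(syn_rate_per_second, timeout_seconds=120):
    """Upper bound on pending sessions once a flood has run for one timeout period."""
    return syn_rate_per_second * timeout_seconds


# Hypothetical flood rate of 10,000 SYN packets per second.
print(half_open_sessions(10_000))
```

<p>Even at this modest assumed rate, 1.2 million half-open sessions would be pending at any given time, which is why a flood protection profile with sensible limits is worth considering.</p>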
<h2 id="conclusion">Conclusion</h2>
<p>But what can I do? My recommendation is to always write your own policies for additional rules rather than adding them to the default VPC rule set.
The VPC default rule set is very good for isolating the VPCs from each other, but that&rsquo;s about it. Within the VPC, you should write your own policy, or at least one for traffic in and out of the VPC.</p>
<p>Use a perimeter firewall, as most attacks do not come from within, and learn how to write meaningful rules on the distributed firewall.
Use Apply To, even on VPC rules, even if they are already restricted to the VPC.
I have written about why <a href="https://sdn-warrior.org/posts/nsx-apply-to/">Apply To</a> is useful in another article. This also applies to the VPC topic.</p>
<p>Otherwise, all I can say is get yourself a Kali Linux and test your lab. I noticed the behavior with TCP Strict more by accident. It&rsquo;s not a huge vulnerability, but you should be aware of what defaults and automatically generated rules or settings do. I don&rsquo;t know yet if there will be another article on the topic of VPCs. My esteemed colleague <a href="https://sdn-techtalk.com/posts/vks-vpc/">Steven Schramm</a> has taken a closer look at the topic of VPCs and VKS. It&rsquo;s a very readable article.</p>
<p>So all that&rsquo;s left for me to say is happy pen testing!</p>
]]></content>
		</item>
		
		<item>
			<title>VCF 9 - Migration to central VCF Operations instance</title>
			<link>https://sdn-warrior.org/posts/vcf-operations-migration/</link>
			<pubDate>Mon, 07 Jul 2025 20:00:00 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf-operations-migration/</guid>
			<description><![CDATA[A brief guide on how to switch from a VCF9 Operations instance to a central VCF Operations instance.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>Where should I start? Perhaps I should describe my starting point.
I deployed a nested lab with VCF9 for the VCF9 beta, so far so good.
This lab came with its own VCF Operations. Now we are GA and I am lucky enough to have 256 core VCF9 licenses.
I wanted to upgrade part of my hardware lab to ESX9 because I also want to test memory tiering in its final version, and that&rsquo;s where my problems start.
In VCF9, there are no more keys and you have to license your products via VCF Operations and vCenter.
This also applies to VVF, or if you want to use ESX9 on your physical lab hardware.</p>
<p>Conclusion: I need a second VCF Operations and would have to split my licenses. However, this is not a good solution because my nested lab is not permanently on, which means that Lab VCF Operations is also not permanently on.</p>
<p>The solution: a central VCF Operations and Fleet Manager.</p>
<h2 id="lets-get-started">Let&rsquo;s get started</h2>
<p>First, I shut down my Nested VCF Operations, as well as the Cloud Proxy and Fleet Manager, because in the best case scenario, I won&rsquo;t need them in my Nested Lab anymore.
Since the licenses are activated online, I have until the end of this year before the next usage report is due. So, for now, nothing will stop working.</p>
<p>Then I deploy a new VCF Operations, a Fleet Manager, and a Cloud Proxy.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Cloud Proxy</b>
        </div>
        <div class="admonition-content">The Cloud Proxy is theoretically optional and can only be deployed after my VCF Operations has been successfully deployed and set up. The basic function of the Cloud Proxy can also be taken over by the Operations appliance if, for example, resources are limited.</div>
    </aside>

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>Deployment</b>
        </div>
        <div class="admonition-content"><p>Here, I will show you the method that worked best for me.
Of course, you can use the VCF Installer to deploy a new VVF or VCF environment and onboard additional setups into an existing Operations instance.
However, since I did not want to redeploy my physical lab, but instead upgraded from ESXi8 to ESX9 beforehand, this method is not possible.
I also didn&rsquo;t want to onboard my physical lab into the nested VCF Operations, as Operations should be continuously available.</p>
<p>I therefore strongly recommend this method only for LAB environments. Don&rsquo;t sue me if your environment crashes. :D So you have been warned.</p>
</div>
    </aside>
<p>The actual deployment is done using standard OVAs that you download from the Broadcom portal beforehand. So far, so unspectacular.
I won&rsquo;t go into detail about the basic setup of the individual appliances, as this is fairly straightforward.
I also won&rsquo;t explain how to integrate a cloud proxy, as this is fairly self-explanatory and can be done via Operations under <em><strong>Administration -&gt; Cloud Proxies</strong></em>.</p>
<figure><a href="01.png"><picture><source srcset="/vcf-operations-migration/01_hu_edd7de637b2abe66.png" type="image/png">
          <img
            src="/vcf-operations-migration/01_hu_edd7de637b2abe66.png"alt="Fleet Manager"width="2016"
            height="1394"/>
        </picture></a><figcaption><p>Fleet Manager OVA (click to enlarge)</p></figcaption></figure>
<p>Perhaps one more note: the Fleet Manager OVA is called VCF-OPS-Lifecycle-Manager-Appliance.</p>
<p>Now that we have deployed our VCF Operations and Fleet Manager, we need to bring the VCF Operations cluster online and connect our Fleet Manager.
To do this, open the Operations admin interface using the following URL: https://&lt; VCFOPS FQDN &gt;/admin</p>
<figure><a href="02.png"><picture><source srcset="/vcf-operations-migration/02_hu_2cd03a6ad5011b47.png" type="image/png">
          <img
            src="/vcf-operations-migration/02_hu_2cd03a6ad5011b47.png"alt="VCF OPS Admin"width="1726"
            height="878"/>
        </picture></a><figcaption><p>VCF Operations Admin Interface (click to enlarge)</p></figcaption></figure>
<p>This is where the “cluster” is put into operation.
The process is straightforward. The operating system is also updated if necessary, provided that an Internet connection is available.</p>
<p>Since we deployed VCF Operations manually, no Fleet Manager is connected yet, of course.</p>
<figure><a href="03.png"><picture><source srcset="/vcf-operations-migration/03_hu_30210f74da8fa062.png" type="image/png">
          <img
            src="/vcf-operations-migration/03_hu_30210f74da8fa062.png"alt="VCF Fleet"width="1723"
            height="982"/>
        </picture></a><figcaption><p>Fleet Management onboarding (click to enlarge)</p></figcaption></figure>
<p>Here you need the fleet manager FQDN and the admin password. In addition, the admin password for VCF Operations is required; without it, the process cannot be completed.</p>
<p>The admin password must comply with the new complexity rules.
The days of <em><strong>VMware1!</strong></em> are over. If a password that is too weak was set during deployment, onboarding will now fail.
Okay, I could have written that earlier, but you are warned about this during OVA deployment, and I&rsquo;m sure my readers would never just click “continue” without thinking – not that this would have happened to the author here.</p>
<figure><a href="04.png"><picture><source srcset="/vcf-operations-migration/04_hu_4bf037df2246e76.png" type="image/png">
          <img
            src="/vcf-operations-migration/04_hu_4bf037df2246e76.png"alt="VCF Fleet success"width="1727"
            height="900"/>
        </picture></a><figcaption><p>Fleet Management onboarding successful (click to enlarge)</p></figcaption></figure>
<p>Of course, I was able to connect to the Fleet Manager right away the first time—at least that&rsquo;s what I&rsquo;m going to say. The whole process takes a little time, so it&rsquo;s best to have a cup of coffee ready.</p>
<p>When the Fleet Manager is connected, a VCF Operations instance with the status Pending Import is displayed in Operations under <em><strong>Fleet Management -&gt; Lifecycle</strong></em>.</p>
<figure><a href="05.png"><picture><source srcset="/vcf-operations-migration/05_hu_a36c0a1a72d7b5c7.png" type="image/png">
          <img
            src="/vcf-operations-migration/05_hu_a36c0a1a72d7b5c7.png"alt="Pending Operations"width="1726"
            height="903"/>
        </picture></a><figcaption><p>Pending VCF Operations (click to enlarge)</p></figcaption></figure>
<p>This should not cause us any concern and is completely normal. Before we can continue here, we need to onboard our future environments so that the Fleet Manager can do its job.</p>
<h2 id="onboarding-of-vcf-and-other-esx9-environments">Onboarding of VCF and other ESX9 environments</h2>
<p>At this point, it would also be possible to deploy a cloud proxy, but this is not necessary.
To continue, my nested LAB and my ESX 9 servers must be integrated.
This is done via <em><strong>Administration -&gt; Integrations</strong></em>. Here, you specify the accounts you want to onboard.
For my ESX 9 servers, I select vCenter as the account type and specify the appropriate vCenter 9 and account details.</p>
<p>For my VCF configuration, I select vSphere Cloud Foundation for VCF.
You need the login details for the SDDC Manager and not the vCenter.
It is recommended to create separate credentials in Operations, even if you use the same password multiple times.
I decided to enable “Domain Monitoring when creating” under Advanced Settings.</p>
<figure><a href="06.png"><picture><source srcset="/vcf-operations-migration/06_hu_b1266a9719366d31.png" type="image/png">
          <img
            src="/vcf-operations-migration/06_hu_b1266a9719366d31.png"alt="Account Integration"width="1724"
            height="898"/>
        </picture></a><figcaption><p>Account Integration (click to enlarge)</p></figcaption></figure>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Information</b>
        </div>
        <div class="admonition-content">By default, this setting is set to false. If you configure a VCF account with the default setting, data collection and monitoring of domains is disabled. If you edit the VCF account and change this setting to true, data collection and monitoring is enabled but it is done only for the newly discovered domains.</div>
    </aside>
<p>Under <em><strong>Collector / Group</strong></em>, you can select the collector you want to use for collecting data. The default collector group contains the VCF Operations appliance itself.</p>
<figure><a href="07.png"><picture><source srcset="/vcf-operations-migration/07_hu_a000a27448596e27.png" type="image/png">
          <img
            src="/vcf-operations-migration/07_hu_a000a27448596e27.png"alt="Collector"width="1725"
            height="900"/>
        </picture></a><figcaption><p>Collecting Status (click to enlarge)</p></figcaption></figure>
<p>After successfully adding our accounts, data collection must also be activated.</p>

    <aside class="admonition danger">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-triangle">
      <path d="M10.29 3.86L1.82 18a2 2 0 0 0 1.71 3h16.94a2 2 0 0 0 1.71-3L13.71 3.86a2 2 0 0 0-3.42 0z"></path>
      <line x1="12" y1="9" x2="12" y2="13"></line>
      <line x1="12" y1="17" x2="12.01" y2="17"></line>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">It&rsquo;s easy to overlook, but if you want to integrate the VCF environment so that you can manage and license it rather than just monitor it, you need to click the Activate Management button in this dialog box for the vCenter. The process takes about 15 minutes. If you don&rsquo;t do this, licensing will not be possible later. The settings can be changed at any time.</div>
    </aside>
<figure><a href="08.png"><picture><source srcset="/vcf-operations-migration/08_hu_b2ce45f4680a7125.png" type="image/png">
          <img
            src="/vcf-operations-migration/08_hu_b2ce45f4680a7125.png"alt="License"width="1229"
            height="794"/>
        </picture></a><figcaption><p>vSphere Client Plugin (click to enlarge)</p></figcaption></figure>
<h2 id="licensing">Licensing</h2>
<p>You must register the VCF Operations instance that you want to use for license management in the VCF Business Services console.
There are two ways to do this: Connected and Disconnected.</p>
<ul>
<li>Connected Mode</li>
</ul>
<p>If the environment is connected to the Internet, you can register your VCF Operations instance in connected mode. Connected mode provides a faster and simplified registration process and makes it easier to update your licenses. In addition, usage is automatically uploaded every night.
The exact process for registering VCF Operations in connected mode can be found <a href="https://techdocs.broadcom.com/us/en/vmware-cis/vcf/vcf-9-0-and-later/9-0/licensing/register-vcf-operations/register-vcf-operation-in.html">here</a>.
However, the entire process is guided very well by the GUI. The license must be updated every 6 months or when a subscription expires. In Connected mode, this can be done with a single click.</p>
<ul>
<li>Disconnected Mode</li>
</ul>
<p>Disconnected mode is for users who run a dark site and therefore have no internet access for VCF Operations. Onboarding is a little more complicated because license files and usage reports have to be downloaded and uploaded manually. Even in disconnected mode, you have to create and upload usage reports regularly. The license must be updated at least every 6 months or when a subscription expires.
Instructions for disconnected mode can be found <a href="https://techdocs.broadcom.com/us/en/vmware-cis/vcf/vcf-9-0-and-later/9-0/licensing/register-vcf-operations/register-vcf-operations-in-disconnected-mode.html">here</a>. However, the process is simple and well guided by the GUI.</p>
<h3 id="assigning-licenses">Assigning licenses</h3>
<p>After about 15 minutes, the previously added vCenter will be displayed in VCF Operations. Under <em><strong>License Management – Licenses</strong></em>, I should now see my vCenter and the VVF or VCF licenses that I previously assigned to my VCF Operations instance via the VCF Business Services console.
Because we have deployed a central VCF Operations, the licenses do not need to be split.
Instead, I can assign the same license to all of my added vCenter 9 instances, as long as there are enough cores available in my license.
To do this, I select the vCenter instances and assign a primary license.
This license is then imported into vCenter and used to license all other VCF products, such as NSX.
vSAN must be assigned to the vCenters as a so-called add-on license.</p>
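<p>Whether one license is enough for all selected vCenter instances is simple arithmetic. The following is only a minimal sketch: the vCenter names, the core counts, and the licensed core count are made-up placeholders, and it does not reflect how VCF Operations counts cores internally.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"># Hypothetical core counts per vCenter instance; replace with your own inventory.
cores_per_vcenter = {
    "vcenter-lab-01": 32,
    "vcenter-lab-02": 16,
    "vcenter-nested-01": 24,
}

# Hypothetical number of cores covered by the primary license.
licensed_cores = 96

needed_cores = sum(cores_per_vcenter.values())
remaining_cores = licensed_cores - needed_cores

# A negative remainder would mean the license does not cover all selected vCenters.
print("needed:", needed_cores, "remaining:", remaining_cores)
</code></pre></div>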
<figure><a href="12.png"><picture><source srcset="/vcf-operations-migration/12_hu_b225deb4bd5247cf.png" type="image/png">
          <img
            src="/vcf-operations-migration/12_hu_b225deb4bd5247cf.png"alt="License"width="1723"
            height="895"/>
        </picture></a><figcaption><p>VCF License (click to enlarge)</p></figcaption></figure>
<p>Now all that&rsquo;s missing is the VCF lifecycle. Remember, we still had one point left to cover.</p>
<h2 id="vcf-lifecycle">VCF Lifecycle</h2>
<p>We still had our Fleet Manager, which hadn&rsquo;t been fully onboarded yet.
First, we need to create deployment targets. These can be found under <em><strong>Fleet Management -&gt; Lifecycle -&gt; VCF Management -&gt; Deployment Targets</strong></em>.</p>
<figure><a href="09.png"><picture><source srcset="/vcf-operations-migration/09_hu_32a498b7c15eb3a1.png" type="image/png">
          <img
            src="/vcf-operations-migration/09_hu_32a498b7c15eb3a1.png"alt="Deployment targets"width="1726"
            height="896"/>
        </picture></a><figcaption><p>Deployment Targets (click to enlarge)</p></figcaption></figure>
<p>The deployment target is important so that the Lifecycle Manager can deploy components.
Here, too, the vCenter is selected and the correct account must be specified. If no suitable account exists, a new one can be created from the dialog box.
The connection is then validated and the deployment target is configured.</p>

    <aside class="admonition Tip">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Tip</b>
        </div>
        <div class="admonition-content">As a deployment target, you need to add every vCenter where you want to deploy stuff or where VCF Operations components are running. In my case, my VCF Operations, Cloud Proxy, and Fleet Manager are running on my management host, which is still in my vSphere 8 environment. That&rsquo;s why this environment also needs to be saved as a deployment target, otherwise the next steps won&rsquo;t work.</div>
    </aside>
<p>Finally, we take care of our VCF Operations instance, which is still displayed as Pending Import. To do this, we click on the Import button under <em><strong>Fleet Management -&gt; Lifecycle -&gt; VCF Management -&gt; Overview for VCF Operations</strong></em>.
<figure><a href="10.png"><picture><source srcset="/vcf-operations-migration/10_hu_d17614a7dedb8fd.png" type="image/png">
          <img
            src="/vcf-operations-migration/10_hu_d17614a7dedb8fd.png"alt="Import"width="1713"
            height="825"/>
        </picture></a><figcaption><p>Import Operations  (click to enlarge)</p></figcaption></figure></p>
<p>Most of it is already preselected; you just need to enter the root password for the vCenter instance where the components (VCF Operations and Cloud Proxies) are located. After confirming with Next, you need to do the same for the Cloud Proxy.</p>

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>Cloud Proxy</b>
        </div>
        <div class="admonition-content">SSH must be enabled on the Cloud Proxy (if one is used). If this was forgotten during deployment, it can easily be enabled via the VM console. To do this, log in as root and run <code>systemctl start sshd</code> and <code>systemctl enable sshd</code>.</div>
    </aside>
<p>Congratulations, our deployment should now look like this.</p>
<figure><a href="11.png"><picture><source srcset="/vcf-operations-migration/11_hu_ccd6fd1068272b11.png" type="image/png">
          <img
            src="/vcf-operations-migration/11_hu_ccd6fd1068272b11.png"alt="Import"width="1728"
            height="892"/>
        </picture></a><figcaption><p>Import Operations  (click to enlarge)</p></figcaption></figure>
<p>We are now able to deploy VCF Operations for logs via the Fleet Manager.
It is important that you have configured your Broadcom token under Depot Configuration or, if you use an offline depot, that the offline depot is configured there.</p>
<p>But there are also a few limitations:</p>
<ul>
<li>VCF Management supports only one instance each of VCF Automation, VCF Operations, VCF Operations for logs, or VCF Operations for networks. All components integrate with VCF Operations.</li>
<li>VCF Management supports multiple instances of VCF Identity Broker. For each VCF instance, VCF Management allows you to deploy one and only one VCF Identity Broker instance.</li>
<li>You deploy VCF Identity Broker into either an embedded form factor such as vCenter or an external form factor such as a VCF appliance.</li>
<li>The first VCF Automation or VCF Operations for networks instance that you import or deploy will be in integrated mode.</li>
</ul>
<h2 id="resource-requirements">Resource requirements</h2>
<table>
  <thead>
      <tr>
          <th>Function</th>
          <th>vCPUs</th>
          <th>RAM (Configured)</th>
          <th>Provisioned Disk</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>VCF Operations</td>
          <td>4</td>
          <td>16 GB</td>
          <td>158.18 GB</td>
      </tr>
      <tr>
          <td>Cloud Proxy</td>
          <td>2</td>
          <td>8 GB</td>
          <td>150.20 GB</td>
      </tr>
      <tr>
          <td>Fleet</td>
          <td>4</td>
          <td>12 GB</td>
          <td>205.43 GB</td>
      </tr>
  </tbody>
</table>
<p>I&rsquo;m pretty sure that the required resources can still be adjusted in the lab. Especially when it comes to RAM, you should easily be able to get by with half, but I haven&rsquo;t been able to test it properly yet.</p>
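<p>To get a feeling for what the three appliances need in total, the table can be added up quickly. This is just a small sketch using the values from the table above; the halved RAM figure is my untested assumption from the paragraph above, not a supported sizing.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"># (vCPUs, RAM in GB, provisioned disk in GB), copied from the table above.
appliances = {
    "VCF Operations": (4, 16, 158.18),
    "Cloud Proxy": (2, 8, 150.20),
    "Fleet": (4, 12, 205.43),
}

total_vcpus = sum(spec[0] for spec in appliances.values())
total_ram_gb = sum(spec[1] for spec in appliances.values())
total_disk_gb = round(sum(spec[2] for spec in appliances.values()), 2)

# Untested assumption: RAM might be cut roughly in half in a lab.
halved_ram_gb = total_ram_gb / 2

print(total_vcpus, "vCPUs,", total_ram_gb, "GB RAM,", total_disk_gb, "GB disk")
print("RAM if halved:", halved_ram_gb, "GB")
</code></pre></div>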
<h2 id="summary">Summary</h2>
<p>Manual deployment is not as intuitive and simple as it is with the VCF installer. However, this setup allows me to easily control and manage my licenses via VCF Operations, which is particularly useful in an environment with nested labs. For a production environment, I would prefer deployment in a VCF Management Domain.
The components can still be shared by multiple instances if the security concept allows it. I think this post has shown that an import is also possible at a later date.</p>
]]></content>
		</item>
		
		<item>
			<title>VCF 9 - NSX VPC Part 2 - distributed Transit Gateway</title>
			<link>https://sdn-warrior.org/posts/vcf9-nsx-vpc-part2/</link>
			<pubDate>Thu, 26 Jun 2025 19:00:00 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf9-nsx-vpc-part2/</guid>
			<description><![CDATA[A short article about VPCs in NSX 9 and VCF 9 Part 2.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>This is the second article in a series of articles (number of articles unknown) about VPCs in VCF 9.
If you are not familiar with VPCs and have not yet come across this topic, I recommend reading <a href="https://sdn-warrior.org/posts/vcf9-nsx-vpc/">Part 1</a> first.
I will not go into all the basics again in this article, as these were already explained in the first article.
In this article, I will look at the distributed deployment of VPCs and show how VPCs can also be used without an edge cluster.</p>
<p>Wait, did he really just write VPCs without Edge Clusters?! Sounds good, right?</p>
<h2 id="lets-go---nsx-project">Let&rsquo;s go - NSX project</h2>
<p>First things first.
Since I am using the same VCF 9 installation as in my first article and we already have VPCs in centralized deployment here, I first create a new NSX project.
Perhaps I should mention NSX Projects again here. VMware writes in the VCF 9 Guide:
<em><strong>A project in NSX is analogous to a tenant. By creating projects, you can isolate security and networking objects across tenants in a single NSX deployment.</strong></em></p>
<p>I think that description fits quite well.
In addition, we can currently only have one transit gateway per project.
Since our default tenant (aka default project) has a transit gateway in centralized deployment, I need a new tenant aka project.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">A small addition: whenever I refer to “tenant” in this article, I actually mean NSX project, but I find the name NSX project a bit cumbersome.</div>
    </aside>
<p>To create a new tenant, click on the Project drop-down menu in the NSX GUI and then click on Manage.</p>
<figure><a href="01.png"><picture><source srcset="/vcf9-nsx-vpc-2/01_hu_14d967522d2c4386.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-2/01_hu_14d967522d2c4386.png"alt="NSX 9 VPC Manhatten"width="1151"
            height="1120"/>
        </picture></a><figcaption><p>NSX 9 project Manhatten (click to enlarge)</p></figcaption></figure>
<p>The most important setting besides the name (yes, I recently watched Oppenheimer) is the External Connection.
This setting determines whether the transit gateway is distributed or centralized.
The External IP Block defines the public VPC network (I explained the different networks in VPCs in <a href="https://sdn-warrior.org/posts/vcf9-nsx-vpc/">Part 1</a>).
We have the option of connecting the project to a T0 gateway independently of the transit gateway.
However, this is only relevant for non-VPC networks and will not be considered here.
The same applies to the option of assigning an edge cluster to the tenant.
Since I really want to avoid edges entirely in this article, we will also ignore this option.
We also have the option of setting distributed firewall rules directly, which only allows communication within the tenant, but since firewalling will be covered in another article, we will ignore this option and disable it.</p>
<p>
    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">In the NSX project create dialog, you can create a new external connection and external IP blocks using the three-dot menu.</div>
    </aside>
We save our Manhattan project and now we have a fresh new tenant.</p>
<h2 id="external-connection---the-distributed-version">External Connection - the distributed version</h2>
<p>We should take another closer look at the topic of external connections.</p>
<figure><a href="02.png"><picture><source srcset="/vcf9-nsx-vpc-2/02_hu_99870950412e2e29.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-2/02_hu_99870950412e2e29.png"alt="Manhatten External Connection"width="1669"
            height="826"/>
        </picture></a><figcaption><p>NSX 9 project Manhatten external connection (click to enlarge)</p></figcaption></figure>
<p>Since we have a distributed connection here, we need a VLAN and a gateway in my physical network.
In my lab, it is VLAN 1011 and my Mikrotik Core Router has 10.28.11.1/24. My external IP block also comes from the network area of VLAN 1011, as my physical router has to route to this network.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">For this distributed external connection, NSX 9 currently only supports static routing. If I had used a different external IP block, I would have had to set up additional routes on my Mikrotik, so laziness wins out in this case.</div>
    </aside>
<h2 id="lets-build-a-vpc-or-two-or">Let&rsquo;s build a VPC, or two, or&hellip;</h2>
<p>Readers of <a href="https://sdn-warrior.org/posts/vcf9-nsx-vpc/">Part 1</a> should now be quite familiar with all of this.</p>
<figure><a href="03.png"><picture><source srcset="/vcf9-nsx-vpc-2/03_hu_86519d7355b83622.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-2/03_hu_86519d7355b83622.png" alt="VPC" width="1715"
            height="1211"/>
        </picture></a><figcaption><p>get started with VPC (click to enlarge)</p></figcaption></figure>
<p>First, we set up our private transit gateway IP block in the VPC Connectivity Profile or create one. Remember, this is an unrouted network, but it must not overlap with the other private VPC networks for all tenants.
This network is used to connect VPCs to each other and is an NSX overlay network.
Next, I configure the service profile, which is where I can configure a DHCP service for my VPC, and finally I create the private VPC IP CIDRs.
As in my other article, this will again be 192.168.5.0/24.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">The private VPC IP CIDRs may overlap with other private VPC networks, as they are not routed between VPCs.</div>
    </aside>
<p>That&rsquo;s it, our first VPC, VPC-Demon, has been created and we can now create our first VPC subnets. I will create a public, a private, and a transit network, each with 16 IPs and DHCP enabled.
I then create a second VPC, VPC-Core, in the same way. I hope the Demon Core doesn&rsquo;t get too close - nudge nudge wink wink.</p>
<p>The tenants&rsquo; VPC networks are displayed in vCenter, but they are managed by NSX. This means that they cannot be extended or created via vCenter as in our default project.</p>
<h2 id="test-architecture">Test architecture</h2>
<p>I created six test VMs and assigned one VM to each VPC network. For a quick overview, here is a table showing all VMs.</p>
<table>
  <thead>
      <tr>
          <th>VM Name</th>
          <th>Network Type</th>
          <th>Gateway</th>
          <th>IP Address</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>alpine01-demon-private</td>
          <td>private</td>
          <td>192.168.5.1/28</td>
          <td>192.168.5.3</td>
      </tr>
      <tr>
          <td>alpine02-demon-transit</td>
          <td>transit</td>
          <td>10.28.22.1/28</td>
          <td>10.28.22.3</td>
      </tr>
      <tr>
          <td>alpine03-demon-public</td>
          <td>public</td>
          <td>10.28.11.17/28</td>
          <td>10.28.11.19</td>
      </tr>
      <tr>
          <td>alpine04-core-private</td>
          <td>private</td>
          <td>192.168.5.1/28</td>
          <td>192.168.5.3</td>
      </tr>
      <tr>
          <td>alpine05-core-transit</td>
          <td>transit</td>
          <td>10.28.22.17/28</td>
          <td>10.28.22.19</td>
      </tr>
      <tr>
          <td>alpine06-core-public</td>
          <td>public</td>
          <td>10.28.11.33/28</td>
          <td>10.28.11.35</td>
      </tr>
  </tbody>
</table>
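<p>Each subnet in the table above is a /28, which is where the 16 IPs per subnet come from. As a quick sanity check, here is a small sketch with Python&rsquo;s standard <code>ipaddress</code> module that derives the network from each gateway and verifies that the VM address sits inside it. The values are copied from the table; nothing here talks to NSX.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress

# (gateway with prefix, VM address) per test VM, copied from the table above.
vms = {
    "alpine01-demon-private": ("192.168.5.1/28", "192.168.5.3"),
    "alpine02-demon-transit": ("10.28.22.1/28", "10.28.22.3"),
    "alpine03-demon-public": ("10.28.11.17/28", "10.28.11.19"),
    "alpine04-core-private": ("192.168.5.1/28", "192.168.5.3"),
    "alpine05-core-transit": ("10.28.22.17/28", "10.28.22.19"),
    "alpine06-core-public": ("10.28.11.33/28", "10.28.11.35"),
}

for name, (gateway, vm_ip) in vms.items():
    # ip_interface accepts a host address with prefix and exposes the enclosing network.
    network = ipaddress.ip_interface(gateway).network
    inside = ipaddress.ip_address(vm_ip) in network
    # num_addresses counts the whole prefix, including network and broadcast address.
    print(name, network, network.num_addresses, inside)
</code></pre></div>
<p>Note that 16 is the size of the whole prefix; the gateway and the reserved addresses come out of that number.</p>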
<p>The final VPC structure of the Manhattan tenant looks like this:</p>
<figure><a href="04.png"><picture><source srcset="/vcf9-nsx-vpc-2/04_hu_b140b65a3b2ea0da.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-2/04_hu_b140b65a3b2ea0da.png" alt="VPC Manhatten" width="1560"
            height="916"/>
        </picture></a><figcaption><p>VPCs tenant Manhatten (click to enlarge)</p></figcaption></figure>
<p>We can also test the interconnectivity between the VPCs from <a href="https://sdn-warrior.org/posts/vcf9-nsx-vpc/">Part 1</a>. The overall network topology looks like this:</p>
<figure><a href="05.png"><picture><source srcset="/vcf9-nsx-vpc-2/05_hu_c3b4a23929de4d86.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-2/05_hu_c3b4a23929de4d86.png" alt="VPC all projects" width="2922"
            height="1002"/>
        </picture></a><figcaption><p>VPCs of all tenants (click to enlarge)</p></figcaption></figure>
<p>Here are the test VMs from the first article:</p>
<table>
  <thead>
      <tr>
          <th>VM Name</th>
          <th>Network Type</th>
          <th>Gateway</th>
          <th>IP Address</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>alpine01-blue-private</td>
          <td>private</td>
          <td>192.168.5.1/28</td>
          <td>192.168.5.3</td>
      </tr>
      <tr>
          <td>alpine02-blue-transit</td>
          <td>transit</td>
          <td>10.28.12.1/28</td>
          <td>10.28.12.3</td>
      </tr>
      <tr>
          <td>alpine03-blue-public</td>
          <td>public</td>
          <td>192.168.72.17/28</td>
          <td>192.168.72.19</td>
      </tr>
      <tr>
          <td>alpine04-red-private</td>
          <td>private</td>
          <td>192.168.5.1/28</td>
          <td>192.168.5.4</td>
      </tr>
      <tr>
          <td>alpine05-red-transit</td>
          <td>transit</td>
          <td>10.28.12.17/28</td>
          <td>10.28.12.19</td>
      </tr>
      <tr>
          <td>alpine06-red-public</td>
          <td>public</td>
          <td>192.168.72.33/28</td>
          <td>192.168.72.35</td>
      </tr>
  </tbody>
</table>
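<p>Comparing the two tables shows that only the private networks reuse the same range across VPCs and tenants, while the transit and public networks are unique. The following sketch cross-checks that with the <code>ipaddress</code> module. The gateways are copied from the two tables above; the dictionary and function names are just illustrative.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress
from itertools import product

# Gateways of the Manhattan tenant (this article) and the default project (Part 1).
manhattan = {
    "demon-private": "192.168.5.1/28",
    "demon-transit": "10.28.22.1/28",
    "demon-public": "10.28.11.17/28",
    "core-private": "192.168.5.1/28",
    "core-transit": "10.28.22.17/28",
    "core-public": "10.28.11.33/28",
}
default_project = {
    "blue-private": "192.168.5.1/28",
    "blue-transit": "10.28.12.1/28",
    "blue-public": "192.168.72.17/28",
    "red-private": "192.168.5.1/28",
    "red-transit": "10.28.12.17/28",
    "red-public": "192.168.72.33/28",
}


def networks(gateways):
    # Map each subnet name to the network that encloses its gateway address.
    return {name: ipaddress.ip_interface(gw).network for name, gw in gateways.items()}


# Collect every pair of subnets from the two tenants whose ranges overlap.
overlapping = [
    (a, b)
    for (a, net_a), (b, net_b) in product(
        networks(manhattan).items(), networks(default_project).items()
    )
    if net_a.overlaps(net_b)
]
print(overlapping)
</code></pre></div>
<p>Only private/private pairs show up in the result, which matches the rule that private VPC CIDRs may overlap because they are never routed between VPCs.</p>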
<h2 id="test-scenarios">Test scenarios</h2>
<p>I will perform certain tests for each VM from VPC-Demon. First, I will test Internet connectivity, then intra-VPC connectivity, i.e., communication with all VMs in VPC-Demon, and then inter-VPC connectivity, i.e., communication with all VMs in VPC-Core.
Finally, I will perform a connectivity test from outside the NSX environment and a connection test to the VMs from VPC-BLUE.
I will do all of this with simple ICMP tests. To do this, all firewalls on NSX and the VMs have been disabled.
That&rsquo;s a lot of tests, so let&rsquo;s get started.</p>
<h2 id="alpine01-demon-private">alpine01-demon-private</h2>
<p>I will describe the tests for this VM in detail as an example. For the other VMs, I will present the results in a table.</p>
<h3 id="internet-connectivity">Internet connectivity</h3>
<p>In my first test, I want to see what alpine01-demon-private can communicate with.
The VM was assigned the IP address 192.168.5.3 by DHCP.
Since our transit gateway was created in distributed mode, we cannot use auto SNAT and generally have no way of using stateful services.
This means that the VM cannot communicate with the internet because the VPC private network is not routed and no SNAT is performed in the public network.</p>
<h3 id="intra-vpc-connectivity">Intra VPC connectivity</h3>
<p>Next, I test the connectivity to the alpine02-demon-transit VM. This test is successful, as the transit VM can be reached via the VPC gateway.
I can also reach the alpine03-demon-public VM. All of this is routed in a fully distributed manner. Cool.</p>
<h3 id="inter-vpc-connectivity">Inter VPC connectivity</h3>
<p>This is where it gets exciting. Basically, I can only access VMs from the same VPC here.
If I had <em><strong>stateful SNAT</strong></em>, I could also access the public VMs of other VPCs belonging to other tenants or VMs of other VPCs in the transit network of my own tenant (provided the firewall allows this).
Since we do not have NAT, a VM from a private VPC network cannot establish an inter-VPC connection.</p>
<h3 id="external-connectivity">External connectivity</h3>
<p>The VM cannot be accessed externally. My core router does not have a route for the 192.168.5.0/28 subnet.
Even if my router had a route for this network, a VM in the private VPC network could not communicate with the outside world.
There is a special case, but I will deal with that later.</p>
<table>
  <thead>
      <tr>
          <th>Test #</th>
          <th>Source</th>
          <th>Destination</th>
          <th>Connect</th>
          <th>NAT</th>
          <th>Dist. Routing</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>1</td>
          <td>demon-private</td>
          <td>Internet</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>2</td>
          <td>demon-private</td>
          <td>demon-transit</td>
          <td>Yes</td>
          <td>-</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>3</td>
          <td>demon-private</td>
          <td>demon-public</td>
          <td>Yes</td>
          <td>-</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>4</td>
          <td>demon-private</td>
          <td>core-private</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>5</td>
          <td>demon-private</td>
          <td>core-transit</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>6</td>
          <td>demon-private</td>
          <td>core-public</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>7</td>
          <td>External</td>
          <td>demon-private</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>8</td>
          <td>demon-private</td>
          <td>blue-private</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>9</td>
          <td>demon-private</td>
          <td>blue-transit</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>10</td>
          <td>demon-private</td>
          <td>blue-public</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
  </tbody>
</table>
<h2 id="alpine02-demon-transit">alpine02-demon-transit</h2>
<table>
  <thead>
      <tr>
          <th>Test #</th>
          <th>Source</th>
          <th>Destination</th>
          <th>Connect</th>
          <th>NAT</th>
          <th>Dist. Routing</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>1</td>
          <td>demon-transit</td>
          <td>Internet</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>2</td>
          <td>demon-transit</td>
          <td>demon-private</td>
          <td>Yes</td>
          <td>-</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>3</td>
          <td>demon-transit</td>
          <td>demon-public</td>
          <td>Yes</td>
          <td>-</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>4</td>
          <td>demon-transit</td>
          <td>core-private</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>5</td>
          <td>demon-transit</td>
          <td>core-transit</td>
          <td>Yes</td>
          <td>-</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>6</td>
          <td>demon-transit</td>
          <td>core-public</td>
          <td>Yes</td>
          <td>-</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>7</td>
          <td>External</td>
          <td>demon-transit</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>8</td>
          <td>demon-transit</td>
          <td>blue-private</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>9</td>
          <td>demon-transit</td>
          <td>blue-transit</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>10</td>
          <td>demon-transit</td>
          <td>blue-public</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
  </tbody>
</table>
<h2 id="alpine03-demon-public">alpine03-demon-public</h2>
<table>
  <thead>
      <tr>
          <th>Test #</th>
          <th>Source</th>
          <th>Destination</th>
          <th>Connect</th>
          <th>NAT</th>
          <th>Dist. Routing</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>1</td>
          <td>demon-public</td>
          <td>Internet</td>
          <td>Yes</td>
          <td>-</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>2</td>
          <td>demon-public</td>
          <td>demon-private</td>
          <td>Yes</td>
          <td>-</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>3</td>
          <td>demon-public</td>
          <td>demon-transit</td>
          <td>Yes</td>
          <td>-</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>4</td>
          <td>demon-public</td>
          <td>core-private</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>5</td>
          <td>demon-public</td>
          <td>core-transit</td>
          <td>Yes</td>
          <td>-</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>6</td>
          <td>demon-public</td>
          <td>core-public</td>
          <td>Yes</td>
          <td>-</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>7</td>
          <td>External</td>
          <td>demon-public</td>
          <td>Yes</td>
          <td>-</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>8</td>
          <td>demon-public</td>
          <td>blue-private</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>9</td>
          <td>demon-public</td>
          <td>blue-transit</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>10</td>
          <td>demon-public</td>
          <td>blue-public</td>
          <td>Yes</td>
          <td>-</td>
          <td>No</td>
      </tr>
  </tbody>
</table>
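<p>Taken together, the three tables above follow a fairly simple pattern. The sketch below encodes that pattern as a small Python function. It is only my summary of the observed ping results with a distributed transit gateway and no NAT, not an official NSX rule set, and the tuple layout and names are made up for illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">OUTSIDE = "outside"  # the Internet or the physical lab network


def reachable(a, b):
    """Return True if a ping between a and b worked in the tests above.

    Endpoints are (tenant, vpc, network_type) tuples or the OUTSIDE marker.
    """
    if a == OUTSIDE or b == OUTSIDE:
        inside = b if a == OUTSIDE else a
        return inside[2] == "public"  # only public subnets talk to the outside
    if a[:2] == b[:2]:
        return True  # same VPC: routed by the VPC gateway
    types = {a[2], b[2]}
    if "private" in types:
        return False  # private subnets never leave their own VPC without NAT
    if a[0] == b[0]:
        return True  # same tenant: transit and public subnets can reach each other
    return types == {"public"}  # other tenant: only public to public


demon_private = ("manhattan", "demon", "private")
demon_transit = ("manhattan", "demon", "transit")
demon_public = ("manhattan", "demon", "public")
core_transit = ("manhattan", "core", "transit")
blue_public = ("default", "blue", "public")

print(reachable(demon_private, core_transit))  # test 5, first table
print(reachable(demon_transit, core_transit))  # test 5, second table
print(reachable(demon_public, blue_public))    # test 10, third table
</code></pre></div>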
<p>That&rsquo;s a lot of connections. Why is that? Technically speaking, the VM should be located in VLAN 1011. But is that really true? Let&rsquo;s take a closer look.</p>
<h2 id="the-magic-of-the-distributed-transit-gateway">The magic of the distributed transit gateway</h2>
<p>When we look at a traceroute from the VM alpine03-demon-public, we see that the traffic is routed even though I am addressing my physical router.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">alpine03-demon-public:~# traceroute 192.168.11.1
</span></span><span class="line"><span class="cl">traceroute to 192.168.11.1 (192.168.11.1), 30 hops max, 46 byte packets
</span></span><span class="line"><span class="cl"> 1  10.28.11.17 (10.28.11.17)  0.224 ms  0.168 ms  0.048 ms
</span></span><span class="line"><span class="cl"> 2  100.64.0.0 (100.64.0.0)  0.064 ms  0.079 ms  0.038 ms
</span></span><span class="line"><span class="cl"> 3  192.168.11.1 (192.168.11.1)  0.265 ms  0.168 ms  0.188 ms
</span></span></code></pre></div><p>In NSX, it looks like this:</p>
<figure><a href="06.png"><picture><source srcset="/vcf9-nsx-vpc-2/06_hu_59cd9ca74038f915.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-2/06_hu_59cd9ca74038f915.png"alt="VPC traceflow"width="1450"
            height="925"/>
        </picture></a><figcaption><p>NSX Traceflow (click to enlarge)</p></figcaption></figure>
<p>Maybe we should also take a look at the traceroute from my router.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[admin@ToR.lab.home] &gt; tool/traceroute 10.28.11.19
</span></span><span class="line"><span class="cl">ADDRESS                          LOSS SENT    LAST     AVG    BEST   WORST STD-DEV STATUS                                                       
</span></span><span class="line"><span class="cl">10.28.11.19                        0%    7   0.6ms     0.5     0.3     0.6     0.1                                                              
</span></span></code></pre></div><p>That looks unusual: a single hop, as if the VM were directly in the VLAN. As a cross-check, a traceroute from my Mac shows neither 100.64.0.0 nor 10.28.11.17 either.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl"> danielkrieger@MBP-DKrieger  ~  traceroute 10.28.11.19
</span></span><span class="line"><span class="cl">traceroute to 10.28.11.19 (10.28.11.19), 64 hops max, 40 byte packets
</span></span><span class="line"><span class="cl"> 1  firewall.lab.home (192.168.10.1)  1.076 ms  0.331 ms  0.204 ms
</span></span><span class="line"><span class="cl"> 2  192.168.9.2 (192.168.9.2)  0.651 ms  0.357 ms  0.309 ms
</span></span><span class="line"><span class="cl"> 3  10.28.11.19 (10.28.11.19)  0.810 ms  1.100 ms  0.482 ms
</span></span></code></pre></div><p>Wait a minute. What is VMware doing?</p>
<p>From an external perspective, my Mikrotik router can address the VM directly, and in the router&rsquo;s ARP table I can see the MAC address of the VM on the switch port of the ESX server.
From the outside, it really looks as if the VM were sitting in VLAN 1011 as usual. However, the VM would then be unable to communicate: if we take a closer look at the VM&rsquo;s IP configuration, we can see that it is in the subnet 10.28.11.16/28 and that its default gateway is not .1 (Mikrotik router) but .17.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">alpine03-demon-public:~# ip a
</span></span><span class="line"><span class="cl">2: eth0: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc pfifo_fast state UP qlen 1000
</span></span><span class="line"><span class="cl">    link/ether 00:50:56:bf:5e:0c brd ff:ff:ff:ff:ff:ff
</span></span><span class="line"><span class="cl">    inet 10.28.11.19/28 scope global eth0
</span></span><span class="line"><span class="cl">       valid_lft forever preferred_lft forever
</span></span><span class="line"><span class="cl">    inet6 fe80::250:56ff:febf:5e0c/64 scope link 
</span></span><span class="line"><span class="cl">       valid_lft forever preferred_lft forever
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">alpine03-demon-public:~# route -e
</span></span><span class="line"><span class="cl">Kernel IP routing table
</span></span><span class="line"><span class="cl">Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
</span></span><span class="line"><span class="cl">default         10.28.11.17     0.0.0.0         UG        0 0          0 eth0
</span></span><span class="line"><span class="cl">10.28.11.16     *               255.255.255.240 U         0 0          0 eth0
</span></span></code></pre></div><p>Another indication that the VM cannot simply be in a VLAN is the fact that I can route to private VPC networks without a T0. The VM is therefore in an overlay network.</p>
<p>As we can see in the Traceflow Tool in NSX, the traffic is sent to the Transit Gateway and then directly to the VLAN. Since our transit gateway is distributed, all of this happens on the ESX server on which the VM is currently running.
This clarifies the outgoing traffic, but how does the response get back?</p>
<p>Well, I mentioned that my Mikrotik router knows the MAC address of the VM on the switch port of the ESX server.
In addition, my Mikrotik does not know the individual subnets that NSX creates for the VPCs, but it has a /24 route for the complete public network.</p>
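<p>The addressing argument above can be checked with a few lines of Python. This is only a sketch: the VM, gateway, and router addresses are taken from the outputs in this article, while the public block 10.28.11.0/24 is my assumption, derived from the /24 route on the router.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress

# Values taken from the "ip a" and "route -e" outputs above.
vm = ipaddress.ip_interface("10.28.11.19/28")
gateway = ipaddress.ip_address("10.28.11.17")
# Router address used as the traceroute target above.
physical_router = ipaddress.ip_address("192.168.11.1")
# Assumption: the /24 route on the Mikrotik covers this block.
public_block = ipaddress.ip_network("10.28.11.0/24")


def describe(vm, gateway, physical_router, public_block):
    """Return the facts that the addressing argument relies on."""
    subnet = vm.network
    return {
        "vm_subnet": str(subnet),
        "gateway_in_vm_subnet": gateway in subnet,
        "router_in_vm_subnet": physical_router in subnet,
        "subnet_inside_public_block": subnet.subnet_of(public_block),
    }


facts = describe(vm, gateway, physical_router, public_block)
for name, value in facts.items():
    print(name, value)
</code></pre></div>
<p>The VM only sees its small /28 and therefore has to hand everything else to .17, while the physical side only needs the one larger block that contains this /28.</p>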
<p>To ensure that data traffic arrives at the correct ESX, the distributed Transit Gateway running on the ESX hosting the VM responds to ARP requests for the VM, allowing my Mikrotik router to send the data traffic to the correct ESX server. The distributed Transit Gateway will never respond to ARP requests for VMs hosted on a different ESX.
The ESX server receives the traffic and sends it directly to the VM.</p>
<p>Insert the “Nice Meme” here.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">The NSX reference design guide for NSX 9 also describes this in great detail. However, I will try to show actual trace flows and screenshots here, rather than just pure theory.</div>
    </aside>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content"><p>When connecting to multiple VLANs, administrators can configure optional routes to external networks, with each route specifying the external gateway of the corresponding VLAN as the next hop destination.</p>
<p>In scenarios where multiple distributed Transit Gateways share the same VLAN, inter-DTGW communication occurs through hairpin routing via the external
gateway.</p>
</div>
    </aside>
<p>That was quite a lot, but we&rsquo;re still not quite done, because there&rsquo;s also a NAT column in my test table.</p>
<h2 id="external-ips-aka-reflexive-nat">External IPs aka reflexive NAT</h2>
<p>Similar to a centralized deployment, it is possible to expose a VM from a private VPC network. However, since we do not have an edge, this is only possible via reflexive (stateless) NAT.
The procedure is the same: we go to the settings of the VPC demon under Additional Configurations and select External IPs.
In the dialog, we can only select VMs that have not yet been assigned an external IP and that have a connection in either a transit or private VPC network.</p>
<figure><a href="07.png"><picture><source srcset="/vcf9-nsx-vpc-2/07_hu_72cea96518f0dfcb.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc-2/07_hu_72cea96518f0dfcb.png"alt="VPC external IPs"width="1158"
            height="724"/>
        </picture></a><figcaption><p>VPC-Demon External IPs (click to enlarge)</p></figcaption></figure>
<p>As we can see in the screenshot, an IP address from the public VPC network range has been assigned.
The VM alpine01-demon-private can be accessed directly via the IP 10.28.11.2 and can also communicate with the Internet via this address.
The complete communication is as follows:</p>
<table>
  <thead>
      <tr>
          <th>Test #</th>
          <th>Source</th>
          <th>Destination</th>
          <th>Connect</th>
          <th>NAT</th>
          <th>Dist. Routing</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>1</td>
          <td>demon-private</td>
          <td>Internet</td>
          <td>Yes</td>
          <td>Yes</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>2</td>
          <td>demon-private</td>
          <td>demon-transit</td>
          <td>Yes</td>
          <td>No</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>3</td>
          <td>demon-private</td>
          <td>demon-public</td>
          <td>Yes</td>
          <td>No</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>4</td>
          <td>demon-private</td>
          <td>core-private</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>5</td>
          <td>demon-private</td>
          <td>core-transit</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>6</td>
          <td>demon-private</td>
          <td>core-public</td>
          <td>Yes</td>
          <td>Yes</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>7</td>
          <td>External</td>
          <td>demon-private</td>
          <td>Yes</td>
          <td>Yes</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>8</td>
          <td>demon-private</td>
          <td>blue-private</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>9</td>
          <td>demon-private</td>
          <td>blue-transit</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>10</td>
          <td>demon-private</td>
          <td>blue-public</td>
          <td>Yes</td>
          <td>Yes</td>
          <td>No</td>
      </tr>
  </tbody>
</table>
<p>Unlike a VPC with Auto SNAT enabled, it is not possible to communicate from a private VPC network with static NAT to a transit network of another VPC.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Reflexive NAT is not visible or configurable for the VPC via the NSX GUI. You can only set the external IP to be assigned automatically.</div>
    </aside>
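<p>To make the term reflexive (stateless) NAT a little more tangible, here is a purely conceptual Python sketch. It is not how NSX implements the feature, and the function names are made up for illustration. The mapping uses the external IP 10.28.11.2 and the internal IP 192.168.5.3 from the nsxcli output in this section.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"># Mapping taken from this article: external IP 10.28.11.2, internal IP 192.168.5.3.
EXTERNAL_TO_INTERNAL = {"10.28.11.2": "192.168.5.3"}
INTERNAL_TO_EXTERNAL = {v: k for k, v in EXTERNAL_TO_INTERNAL.items()}


def translate_outbound(src, dst):
    """Rewrite the source of a packet leaving the VPC (no state is kept)."""
    return INTERNAL_TO_EXTERNAL.get(src, src), dst


def translate_inbound(src, dst):
    """Rewrite the destination of a packet arriving from outside."""
    return src, EXTERNAL_TO_INTERNAL.get(dst, dst)


print(translate_outbound("192.168.5.3", "192.168.11.1"))
print(translate_inbound("192.168.11.1", "10.28.11.2"))
</code></pre></div>
<p>Both directions are a simple 1:1 lookup with no connection table, which is what makes it possible to do this without an edge.</p>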
<p>With <em><strong>get transport-node external-ip</strong></em>, you can view the external IPs implemented on the ESX host in nsxcli.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">vcf09-m01-esx01.lab.home&gt; get transport-node external-ip 
</span></span><span class="line"><span class="cl">Thu Jun 26 2025 UTC 17:08:36.740
</span></span><span class="line"><span class="cl">                                                  External IP Mapping Table                                                  
</span></span><span class="line"><span class="cl">-----------------------------------------------------------------------------------------------------------------------------
</span></span><span class="line"><span class="cl">       External IP         Internal IP Discovered       Internal IP Seen            VNI          MAC Address       Port ID   
</span></span><span class="line"><span class="cl">       10.28.11.2                192.168.5.3                 0.0.0.0               69634      00:50:56:bf:5d:b1    67108886
</span></span></code></pre></div><p>The NSX Reference Design Guide contains traffic flow diagrams for almost all of my tests, for anyone who would like to review everything in detail.</p>
<h2 id="conclusion">Conclusion</h2>
<p>This article shows how VPCs can be operated in NSX 9 in a distributed architecture without any edge clusters. Of particular interest is the functionality of the Distributed Transit Gateway, which transparently mediates between overlay and physical VLAN networks. Despite the absence of edges, many core functions such as routing, public access, and external connections remain fully usable via Reflexive NAT and distributed routing.</p>
<p>The setup allows virtual network functions to be deployed independently of the NSX Edges in a tenant. These could, for example, be provided by a tenant customer themselves. VPCs have become significantly more powerful and exciting than they were in NSX 4.X. This increases flexibility, but also, to be fair, increases complexity. When designing, you have to think more about traffic flow and when traffic is distributed and when it is not. The transit gateway is an exciting new feature and is already familiar from the public cloud sector.</p>
<p>I think in the next VPC article I will deal with the topic of security.</p>
]]></content>
		</item>
		
		<item>
			<title>NSX - IDFW &amp; DFW Troubleshooting</title>
			<link>https://sdn-warrior.org/posts/nsx-dfw-troubleshooting/</link>
			<pubDate>Tue, 24 Jun 2025 01:00:00 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/nsx-dfw-troubleshooting/</guid>
			<description><![CDATA[A brief guide to troubleshooting the IdFW and dFW]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>Today, I would like to keep it brief and share a quick tip for all Identity (distributed) firewall administrators.
I&rsquo;m sure everyone has been in this situation: you&rsquo;ve written a nice new Identity Firewall rule, but for some reason it doesn&rsquo;t work.
If it were a normal firewall rule in NSX, I would first check the Traceflow Tool, but unfortunately that doesn&rsquo;t work with the Identity firewall.</p>
<h2 id="what-is-the-identity-firewall">What is the Identity Firewall?</h2>
<p>An identity firewall does not make firewall decisions based on IP addresses, but uses user and group information to apply security policies.
It identifies users based on their login credentials and dynamically assigns them appropriate access rights and firewall rules.
This enables more granular control and improved protection, as access permissions are directly linked to the actual identity of the user.
If you want to know how I used this for one of my customers, you can read about it <a href="https://sdn-warrior.org/posts/nsx-idfw-vdi/">here</a>.</p>
<h2 id="how-do-i-now-fix-my-firewall-rule">How do I now fix my firewall rule?</h2>
<p>There may be several reasons why my Identity Firewall is not working. Here is a list of some of them:</p>
<ul>
<li>
<p>Group memberships in AD have been changed or new users have been created.</p>
<p>This is one of the simplest cases.
The default Delta Sync interval with AD is 180 minutes, and in larger environments, this interval may be set even longer for performance reasons.
There are two solutions here: either wait or force a Delta Sync under System - Identity Firewall AD.</p>
</li>
<li>
<p>The Identity Firewall is turned off.</p>
<p>This may sound a bit silly, but it has actually happened to me.
You can create Identity Firewall Rules and groups, push the policy, and then find that the rules do not work.
You don&rsquo;t get an error message or notification. If this is the case, the Identity Firewall may not be enabled for the cluster on which the source VMs are located. To check this, go to Security - Distributed Firewall; under Settings, you can enable or disable the Identity Firewall for each cluster.</p>
</li>
</ul>
<figure><a href="04.png"><picture><source srcset="/nsx-ts-idfw/04_hu_929d6e2a291f7a2d.png" type="image/png">
          <img
            src="/nsx-ts-idfw/04_hu_929d6e2a291f7a2d.png"alt="iDFW"width="1705"
            height="972"/>
        </picture></a><figcaption><p>iDFW Settings (click to enlarge)</p></figcaption></figure>
<ul>
<li>
<p>Guest Introspection not installed</p>
<p>If you want to use Guest Introspection, i.e. the variant that relies on VMware Tools, the Guest Introspection drivers must also be installed on the source VM.
The standard VMware Tools installation does not include them. Either run a complete VMware Tools installation or select Guest Introspection explicitly.
This source of error is of course eliminated if you use log scraping instead (which I personally do not prefer).</p>
</li>
</ul>
<h2 id="what-should-you-do-if-you-want-to-know-whether-rules-have-actually-been-implemented-for-a-vm">What should you do if you want to know whether rules have actually been implemented for a VM?</h2>
<p>As mentioned at the beginning, unfortunately you cannot use the Traceflow tool to check an Identity Firewall rule.
However, there are ways to see if there are Identity Firewall user sessions and which rules have been implemented.</p>
<figure><a href="01.png"><picture><source srcset="/nsx-ts-idfw/01_hu_ffb7396154069db3.png" type="image/png">
          <img
            src="/nsx-ts-idfw/01_hu_ffb7396154069db3.png"alt="iDFW Sessions"width="1190"
            height="995"/>
        </picture></a><figcaption><p>IDFW User Sessions (click to enlarge)</p></figcaption></figure>
<p>The easiest way to check the sessions is in the GUI: go to Security - Security Overview and scroll all the way down to Identity Firewall User Sessions.
The VM is not displayed in plain text, but Universal Search in NSX resolves the VM.</p>

    <aside class="admonition tip">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-sun">
      <circle cx="12" cy="12" r="5"></circle>
      <line x1="12" y1="1" x2="12" y2="3"></line>
      <line x1="12" y1="21" x2="12" y2="23"></line>
      <line x1="4.22" y1="4.22" x2="5.64" y2="5.64"></line>
      <line x1="18.36" y1="18.36" x2="19.78" y2="19.78"></line>
      <line x1="1" y1="12" x2="3" y2="12"></line>
      <line x1="21" y1="12" x2="23" y2="12"></line>
      <line x1="4.22" y1="19.78" x2="5.64" y2="18.36"></line>
      <line x1="18.36" y1="5.64" x2="19.78" y2="4.22"></line>
   </svg></div><b>Practical tip</b>
        </div>
        <div class="admonition-content">Unfortunately, the NSX GUI is not always the fastest, and the display is not 100% reliable. If a user logs in and out again quickly, the GUI may not display anything. For this reason, IDFW rules can also be logged if you have a central syslog.</div>
    </aside>
<p>The other option is to view it via the CLI.</p>
<h2 id="cli---find-the-rules">CLI - Find the rules!</h2>
<p>To find out which firewall rules are actually implemented on a VM, we first need to log in to the ESX server on which the VM is currently running.
Then we use a simple CLI command to find the network card of our VM.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">summarize-dvfilter | grep -A16 WinClientA0001
</span></span></code></pre></div><p>The result should look like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[root@esxnuc03:~] summarize-dvfilter | grep -A16 WinClientA0001
</span></span><span class="line"><span class="cl">world 1085633 vmm0:WinClientA0001 vcUuid:&#39;50 27 d1 31 1f 89 df b1-65 58 00 d0 ab 64 1f 91&#39;
</span></span><span class="line"><span class="cl"> port 67108888 WinClientA0001.eth0
</span></span><span class="line"><span class="cl">  vNic slot 2
</span></span><span class="line"><span class="cl">   name: nic-1085633-eth0-vmware-sfw.2
</span></span><span class="line"><span class="cl">   agentName: vmware-sfw
</span></span><span class="line"><span class="cl">   state: IOChain Attached
</span></span><span class="line"><span class="cl">   vmState: Attached
</span></span><span class="line"><span class="cl">   failurePolicy: failClosed
</span></span><span class="line"><span class="cl">   serviceVMID: 1
</span></span><span class="line"><span class="cl">   filter source: Dynamic Filter Creation
</span></span><span class="line"><span class="cl">   moduleName: nsxt-vsip-24765085
</span></span><span class="line"><span class="cl">[root@esxnuc03:~] 
</span></span></code></pre></div><p>The name of the NIC is important for our next step. In my example, it is <em><strong>nic-1085633-eth0-vmware-sfw.2</strong></em>.</p>
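<p>If you have to do this more often, picking the name out of the output can be scripted. The following Python sketch is only an illustration: the function name is made up, and it assumes that the output always has the shape shown above (a <em>world</em> line containing <em>vmm0:VM-name</em>, followed by indented <em>name:</em> lines).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"># Shortened copy of the summarize-dvfilter output shown above.
SAMPLE = """\
world 1085633 vmm0:WinClientA0001 vcUuid:'50 27 d1 31 1f 89 df b1-65 58 00 d0 ab 64 1f 91'
 port 67108888 WinClientA0001.eth0
  vNic slot 2
   name: nic-1085633-eth0-vmware-sfw.2
   agentName: vmware-sfw
   state: IOChain Attached
"""


def filter_names(output, vm_name):
    """Collect the 'name:' values listed under the world of the given VM."""
    names = []
    in_vm = False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("world "):
            # A new world block starts; check whether it belongs to our VM.
            in_vm = ("vmm0:" + vm_name) in stripped.split()
        elif in_vm and stripped.startswith("name: "):
            names.append(stripped[len("name: "):])
    return names


print(filter_names(SAMPLE, "WinClientA0001"))
</code></pre></div>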
<p>Next, let&rsquo;s display all realized rules for the VM:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">vsipioctl getrules -f nic-1085633-eth0-vmware-sfw.2
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[root@esxnuc03:~] vsipioctl getrules -f nic-1085633-eth0-vmware-sfw.2
</span></span><span class="line"><span class="cl">ruleset mainrs {
</span></span><span class="line"><span class="cl">  # generation number: 0
</span></span><span class="line"><span class="cl">  # realization time : 2025-06-23T18:55:56
</span></span><span class="line"><span class="cl">  # FILTER (APP Category) rules
</span></span><span class="line"><span class="cl">  rule 16360 at 1 inout protocol any from any to any with extended src ba9e01bc-5779-4eb2-813e-4c5b8e3ff1bf drop;
</span></span><span class="line"><span class="cl">  rule 14312 at 2 inout protocol icmp from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 accept;
</span></span><span class="line"><span class="cl">  rule 14312 at 3 inout protocol ipv6-icmp from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 accept;
</span></span><span class="line"><span class="cl">  rule 14316 at 4 inout protocol any from addrset a34212cb-acb2-49b3-b74c-7683c0345a19 to any drop with log tag &#39;alpine-drop&#39;;
</span></span><span class="line"><span class="cl">  rule 2 at 5 inout protocol any from any to any accept with log tag &#39;debug&#39;;
</span></span><span class="line"><span class="cl">}
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">ruleset mainrs_L2 {
</span></span><span class="line"><span class="cl">  # generation number: 0
</span></span><span class="line"><span class="cl">  # realization time : 2025-06-23T18:55:56
</span></span><span class="line"><span class="cl">  # FILTER rules
</span></span><span class="line"><span class="cl">  rule 1 at 1 inout ethertype any stateless from any to any accept;
</span></span><span class="line"><span class="cl">}
</span></span></code></pre></div><figure><a href="05.png"><picture><source srcset="/nsx-ts-idfw/05_hu_60396c999947fc28.png" type="image/png">
          <img
            src="/nsx-ts-idfw/05_hu_60396c999947fc28.png"alt="Firewallrules"width="1450"
            height="789"/>
        </picture></a><figcaption><p>Firewall Rules (click to enlarge)</p></figcaption></figure>
<p>We can see in the output that firewall rule 16360 has been implemented on my test VM and is therefore active. This firewall rule is my identity firewall rule, and a user who is enabled for this rule is logged in.</p>
<p>We can also see that the VM has implemented two additional rules, namely 14312 and 14316. Although these rules have nothing to do with the VM, the VM is still assigned these rules because the Apply to field is set to dfw, meaning that all VMs receive these rules. I have described the importance of the Apply to field <a href="https://sdn-warrior.org/posts/nsx-apply-to/">here</a>.</p>
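<p>The check &ldquo;is rule X realized on this VM?&rdquo; can also be scripted against the saved output. This is a minimal Python sketch with made-up function names, and it assumes that every rule line follows the <em>rule ID at position</em> pattern seen in the output above.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import re

# Shortened copy of the vsipioctl getrules output shown above.
SAMPLE = """\
ruleset mainrs {
  # FILTER (APP Category) rules
  rule 16360 at 1 inout protocol any from any to any with extended src ba9e01bc-5779-4eb2-813e-4c5b8e3ff1bf drop;
  rule 14312 at 2 inout protocol icmp from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 accept;
  rule 14312 at 3 inout protocol ipv6-icmp from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 accept;
  rule 2 at 5 inout protocol any from any to any accept with log tag 'debug';
}
"""

# Assumption: rule lines start with "rule ID at POSITION".
RULE_RE = re.compile(r"^\s*rule (\d+) at (\d+) ")


def realized_rule_ids(output):
    """Return the rule IDs in order of appearance, without duplicates."""
    ids = []
    for line in output.splitlines():
        match = RULE_RE.match(line)
        if match and int(match.group(1)) not in ids:
            ids.append(int(match.group(1)))
    return ids


def is_realized(output, rule_id):
    """True if the given rule ID shows up in the realized rules."""
    return rule_id in realized_rule_ids(output)


print(realized_rule_ids(SAMPLE))
print(is_realized(SAMPLE, 16360))
</code></pre></div>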
<p>Theoretically, you can also see this in the GUI, but with a delay, and in my experience it was more often unreliable than reliable. Nevertheless, I want to show it. If you look at the firewall rule, then open the group and look at the effective members, the VM should be listed there. As I said, the GUI is very slow here, and if I have problems with the IDFW, I prefer to rely on the CLI.</p>
<figure><a href="02.png"><picture><source srcset="/nsx-ts-idfw/02_hu_794d76f92e458fe7.png" type="image/png">
          <img
            src="/nsx-ts-idfw/02_hu_794d76f92e458fe7.png"alt="Effective Members"width="1188"
            height="970"/>
        </picture></a><figcaption><p>Effective Members (click to enlarge)</p></figcaption></figure>
<p>This method allows you to quickly check whether an IDFW / DFW rule has been realized on the VM. If the IDFW is disabled for the cluster, as described above, I can see the firewall rule in the GUI, but it is not displayed in the CLI.
The same would be the case if I had made a mistake in the Apply to field. The rule would not be realized on my VM and would therefore not be shown in the CLI either.</p>
<h2 id="summary">Summary</h2>
<p>Troubleshooting the IDFW can sometimes be a little difficult, but it&rsquo;s not impossible.
I hope this quick practical tip helps you. It has often helped me when the GUI has tricked me. In combination with the check mechanisms in the GUI (even if these are sometimes slow) and the logging, the IDFW / DFW is easy to troubleshoot.</p>
]]></content>
		</item>
		
		<item>
			<title>Homelab V6 - It’s not just Taylor Swift who has different eras, my home lab does too</title>
			<link>https://sdn-warrior.org/posts/homelab-v6/</link>
			<pubDate>Fri, 20 Jun 2025 17:00:00 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/homelab-v6/</guid>
			<description><![CDATA[A quick update on what's been happening in my lab over the last few weeks.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>I&rsquo;ve had an exciting few weeks. I was part of the VCF9 beta and ran several VCF instances in parallel, which meant I kept running into performance bottlenecks.
On top of that, the Minisforum A2 was finally available. But first things first.</p>
<h2 id="25-gbs-lan---and-what-i-have-learned">2.5 Gb/s LAN - and what I have learned</h2>
<p>Well, where should I start? I bought the Mikrotik CRS326-4C+20G+2Q for my NUCs, which offered me 20x 2.5 Gb/s ports and wasn&rsquo;t exactly cheap at around 800 euros.
In hindsight, I can say that it wasn&rsquo;t a good investment. Not that the switch is bad, it&rsquo;s not, but as of today, I only use exactly two of the 2.5G ports.
Instead, I use all of the 10G ports and the 40G ports (with breakout), as 10G ports are now in short supply. Following the recent upgrade, I have exactly one 10G port remaining in the entire lab.
That&rsquo;s why I&rsquo;m actually considering selling the switch again.
But hey, that&rsquo;s part of homelabbing. Sometimes you just make bad investments.</p>
<p>Another thing that bothers me about 2.5G is that I constantly have problems with autospeed. Sooner or later, I lose a network port when the switch or server is set to autospeed. This isn&rsquo;t just a problem with Mikrotik switches; I&rsquo;ve had the same issue with other manufacturers as well. It also affects my TP-Link access points.
As a workaround, I set everything to a fixed speed, but then Wake-on-LAN doesn&rsquo;t work because the NUCs can&rsquo;t handle 2.5 Gb/s in power save mode. There&rsquo;s always something.
That&rsquo;s why I decided three weeks ago to sell my beloved NUC cluster and say goodbye to the 2.5 Gb/s LAN experiment. And because I spent a little too much money at Minisforum. But more on that later.</p>
<h2 id="post-nuc-era">Post NUC era</h2>
<p>It&rsquo;s not just Taylor Swift who has different eras, my home lab does too. Does that mean I&rsquo;ve now said goodbye to NUCs altogether?
A resounding yes and no. I won&rsquo;t be building any more NUC-based workloads, simply because although the ratio between CPU performance and RAM is good again with the 14th generation, the network is just too slow, especially with VCF9, vSAN ESA and other things I want to test.</p>
<p>However, the NUC 14 with Ultra 7 155H CPU, 16 threads, and a maximum of 128 GB RAM is the perfect management NUC, and since I got it used for around €310, it will be my last remaining NUC with ESX.
For management, the single 2.5 Gb/s adapter is sufficient, which is just as well: since it is not an H version (high), no second network card can be installed.</p>
<figure><picture><source srcset="/labv6/01_hu_40b00751ee2d3b48.jpg" type="image/jpeg">
          <img
            src="/labv6/01_hu_40b00751ee2d3b48.jpg"alt="Dusty NUC 14"width="2016"
            height="1512"/>
        </picture><figcaption><p>Dusty NUC 14 (click to enlarge)</p></figcaption></figure>
<p>Yes, I know that the NUC is dusty, but that&rsquo;s hard to avoid with an open rack.</p>
<h2 id="whats-new">What&rsquo;s new?</h2>
<p>Now for the obvious: I bought two MS-A2s from Minisforum. These form my new AMD cluster with 32 cores, 64 threads, and 2x 128 GB RAM. In addition, everything on the network side has now been converted to 2x 10 Gb/s per ESXi server.</p>
<figure><picture><source srcset="/labv6/05_hu_aac61a3e1ccddf00.jpg" type="image/jpeg">
          <img
            src="/labv6/05_hu_aac61a3e1ccddf00.jpg"alt="MS-A2"width="993"
            height="433"/>
        </picture><figcaption><p>MS-A2 Powerhouse</p></figcaption></figure>
<p>Since the housing dimensions are completely identical to the MS-01, I was able to use the thingsINrack custom rackmount solution again.
Unfortunately, they have become significantly more expensive than last time.
You now have to pay around 90 euros for the 3D-printed bracket.
The front is molded plastic and therefore has a nice finish. Of all the solutions on the market, this is still the cheapest and most flexible solution.
I have had my MS-01 in a rackmount like this from the beginning and have not had any changes or problems with the rackmount so far (I am not being paid for this, I am just very happy with it).</p>
<p>In addition, two more MS-01s have been added.
I posted this on LinkedIn weeks ago, but never got around to documenting it here on the blog.
The four existing MS-01s have been upgraded to 96 GB of RAM. Now you might ask why we didn&rsquo;t go straight to 128 GB – well, what can I say?
Stupid decisions or bad timing. Shortly after I received the 96 GB RAM for all my MS-01s, the 128 GB RAM was announced.
But since everything was already installed, I didn&rsquo;t want to send the RAM back.</p>
<p>That&rsquo;s life as an early adopter sometimes.</p>
<h2 id="the-lab-is-finished-now--the-lab-is-finished-now">The lab is finished now! &hellip; The lab is finished now?</h2>
<p>Please insert the Star Wars Anakin and Padmé meme here. I would put it in the blog, but I&rsquo;m afraid Disney would sue me :D
At least one update is still pending. My management NUC urgently needs 128 GB of RAM, and I think it will get that next month.
Otherwise, I&rsquo;m still thinking about memory tiering on the AMD servers, as they have a lot of CPU power.
However, I would have to buy more NVMes for this, as only 2 TB per MS-A2 is currently installed and I don&rsquo;t have any fast NVMes left in stock.</p>
<p>Another consideration is to run the MS-01s on Proxmox – has anyone here ever called Jehovah?
Don&rsquo;t panic, I&rsquo;m staying loyal to VMware. The idea is to use Proxmox and deploy a thick nested ESX VM on it.
Proxmox can use hyperthreading on CPUs with E/P cores, which unfortunately neither ESXi 8 nor ESXi 9 can do at the moment.
But of course, ESXi is designed to run on enterprise server hardware, where asymmetric cores are currently not a thing.
In any case, the MS-01 would then have 20 threads and I could deploy nested ESXi hosts with 16 vCPUs.
But this is currently nothing more than a thought experiment.
Technically, I&rsquo;ve done this before, but without measuring performance.
I could well imagine that a nested ESXi would run better on a physical ESXi than on a Proxmox server. Maybe a future experiment.</p>
<h2 id="pictures-or-it-didnt-happen">Pictures or it didn&rsquo;t happen</h2>
<figure><a href="02.jpg"><picture><source srcset="/labv6/02_hu_59b9c9139cd70ae.jpg" type="image/jpeg">
          <img
            src="/labv6/02_hu_59b9c9139cd70ae.jpg"alt="Network"width="6577"
            height="4155"/>
        </picture></a><figcaption><p>Network setup (click to enlarge)</p></figcaption></figure>
<p>Here is my network diagram. I have often been asked which tool I use to draw these diagrams: it is Excalidraw. It runs as a Docker container on my Unraid server.</p>
<figure><a href="04.jpg"><picture><source srcset="/labv6/04_hu_b7c82d2dde7bee91.jpg" type="image/jpeg">
          <img
            src="/labv6/04_hu_b7c82d2dde7bee91.jpg"alt="capacity"width="1708"
            height="493"/>
        </picture></a><figcaption><p>Lab capacity (click to enlarge)</p></figcaption></figure>
<p>Here you can see the lab capacity, but you should treat the GHz capacity with caution, as it is not accurate due to the E/P cores. vSphere calculates the capacity based on the first core and the maximum non-boosted frequency of the core, and then multiplies this by the number of cores in the server. This is why the display is never accurate for consumer CPUs, at least for the newer Intel generation.</p>
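<p>The effect is easy to see with a little arithmetic. The following Python sketch is only an illustration under stated assumptions: the core counts and base clocks are example values for a hypothetical hybrid CPU with 6 P-cores and 8 E-cores, not readings from my hosts, and the function names are made up for this example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"># Illustrative sketch: why the displayed GHz capacity is off on hybrid CPUs.
# All numbers below are assumed example values, not measurements.


def reported_capacity_ghz(first_core_base_ghz, total_cores):
    """Capacity as described above: base clock of the first core times core count."""
    return first_core_base_ghz * total_cores


def per_core_sum_ghz(core_groups):
    """Sum of base clocks when every core type is counted with its own clock."""
    return sum(count * base_ghz for count, base_ghz in core_groups)


# Hypothetical hybrid CPU: 6 P-cores at 2.6 GHz and 8 E-cores at 1.9 GHz.
core_groups = [(6, 2.6), (8, 1.9)]
total_cores = sum(count for count, _ in core_groups)

# Assumption: the first core is a P-core, so its base clock is used for all cores.
reported = reported_capacity_ghz(2.6, total_cores)
summed = per_core_sum_ghz(core_groups)

print(f"reported by the simple formula: {reported:.1f} GHz")
print(f"sum of per-core base clocks:    {summed:.1f} GHz")
print(f"difference:                     {reported - summed:.1f} GHz")
</code></pre></div>
<p>With these example values, the simple formula counts every E-core as if it were a P-core, which is why the displayed figure comes out too high.</p>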
<p>In total, my lab now has 132 physical cores, including E and P cores. However, the core type does not matter when creating a VM. If a product requires a minimum number of vCPUs, as VCF Automation does with 24 vCPUs, then a single ESX host must be able to run those 24 vCPUs, regardless of whether they ultimately land on E or P cores. This is also why I bought two AMD boxes: I want to test VCF9 with Automation.
I currently have 896 GB of RAM. I could upgrade the MS-01s to 128 GB, but that would cost another €1,800, which is too expensive for me at the moment – cost-benefit analysis and all that.</p>
<figure><a href="03.jpg"><picture><source srcset="/labv6/03_hu_578e33cfe9e77d07.jpg" type="image/jpeg">
          <img
            src="/labv6/03_hu_578e33cfe9e77d07.jpg"alt="the lab"width="2142"
            height="2856"/>
        </picture></a><figcaption><p>Lab v6 (click to enlarge)</p></figcaption></figure>
<p>Yes, there is still a NUC Extreme, and it will stay. However, it is my self-built NAS and does not run any VMware workloads. It is also connected with 2x10G.
That is why I have never really considered it part of the NUC era.</p>
]]></content>
		</item>
		
		<item>
			<title>VCF 9 - NSX VPC Part 1 - centralized Transit Gateway</title>
			<link>https://sdn-warrior.org/posts/vcf9-nsx-vpc/</link>
			<pubDate>Tue, 17 Jun 2025 18:00:00 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf9-nsx-vpc/</guid>
			<description><![CDATA[A short article about VPCs in NSX 9 and VCF 9.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>Where to start? The VPC feature has been available in NSX for a long time, but it has mostly flown under the radar. In VCF 9, VPCs are now front and center and, small spoiler, I say rightly so!</p>
<p>But wait, from the top, what are VPCs anyway?
VMware says NSX Virtual Private Clouds is an abstraction layer that enables the creation of standalone virtual private cloud networks within an NSX project in order to use network and security services in a self-service usage model.</p>
<p>While in NSX 4.X VPCs were only visible in the GUI when an NSX project was created, VMware has now changed this completely. As soon as you open the NSX GUI, the VPCs tab immediately catches your eye.</p>
<figure><a href="01.png"><picture><source srcset="/vcf9-nsx-vpc/01_hu_f3d2bf50d936f009.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc/01_hu_f3d2bf50d936f009.png"alt="NSX 9 VPC"width="1710"
            height="1092"/>
        </picture></a><figcaption><p>NSX 9 VPC (click to enlarge)</p></figcaption></figure>
<p>As you can see here in the screenshot, VPC is very present in NSX and VCF 9, but before we can talk about VPCs, we need to talk about a new gateway in NSX 9 - the Default Transit Gateway.</p>
<h2 id="default-transit-gateway">Default Transit Gateway</h2>
<p>What the hell is a Default Transit Gateway? Yes, that&rsquo;s what I asked myself when I first tried out the beta of VCF 9.
Whereas in NSX 4.X a VPC was attached directly to a T0 or a VRF, in VCF 9 a Transit Gateway, by default the Default Transit Gateway, now sits between the VPC and the T0. At present, you can only have one Transit Gateway per project.
What is new in VCF 9 or NSX 9 is that the default project can now also have one or more VPCs.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">I will not go into NSX projects here, as that would go beyond the scope of this article.
However, you can think of a project as representing a department or customer, while a VPC is a logically separate environment within the project.
For example, VPCs can be used to implement test and staging environments within a project, or to provide different development environments for different teams in the same project.</div>
    </aside>
<p>The default Transit Gateway is present from the start and cannot be deleted. The configuration options are also very limited.
I can specify the name, the External Connection and the VPC Transit Subnet. This network comes from the 100.64.x.x range, is normally assigned by NSX itself and does not need to be customized.
The HA mode is inherited from the T0 to which my Transit Gateway is connected or, more precisely, my VPC connectivity profile determines whether the HA mode is inherited from my T0 or not.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">If you want to use default outbound NAT in your VPC, then your T0 should be active/standby or active/active stateful. Otherwise only manual static NAT is possible.</div>
    </aside>
<p>This is only half the truth, though, because there is also the so-called distributed mode. It requires that I create my External Connection as a Distributed Connection and not as a Centralized Connection (with Edge Transport Nodes). But that&rsquo;s another story, and I will work through the Distributed Connection in a separate article.</p>
<h2 id="lets-get-started---network-connectivity">Let&rsquo;s get started - Network Connectivity</h2>
<p>First we need to configure our network connectivity. There are two ways to do this: in vCenter, or in NSX Manager under System -&gt; Setup Network Connectivity. This menu item is new. In VCF 9, edges are no longer deployed via SDDC Manager as in VCF 5.X, but via the network connectivity workflow. Here I can also expand my Edge Cluster or deploy a new Edge Cluster.
Fortunately, the deployment process has been optimized: it is less error-prone and has a few quality-of-life improvements.</p>
<figure><a href="02.png"><picture><source srcset="/vcf9-nsx-vpc/02_hu_55ab8dd9130caed0.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc/02_hu_55ab8dd9130caed0.png"alt="vCenter Network Connectivity"width="1720"
            height="771"/>
        </picture></a><figcaption><p>vCenter Network Connectivity (click to enlarge)</p></figcaption></figure>
<p>I now have to configure my VPC External IP Blocks in the Network Configuration. These IP blocks are then used for all public networks and must therefore be routed in my physical network.
This also means that these networks must not overlap. The Private - Transit Gateway IP Blocks are new.
These are used so that the VPCs of the same project can communicate with each other.
Which networks are used in VPCs will be explained in the course of the article.
The Private - Transit Gateway IP Blocks do not have to be routed in the physical network. They are used exclusively for communication between VPCs on the same Transit Gateway.</p>
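<p>Before entering IP blocks, it can help to check them for overlaps. The following Python sketch uses only the standard <code>ipaddress</code> module. The block names and the /24 and /25 sizes are assumptions for illustration (this article only shows addresses from 192.168.72.x and 10.28.12.x), and <code>find_overlaps</code> is a made-up helper, not an NSX function.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress
from itertools import combinations


def find_overlaps(blocks):
    """Return all pairs of named CIDR blocks that overlap each other."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in blocks.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]


# Assumed block sizes; the names on the left are only labels for this example.
lab_blocks = {
    "external": "192.168.72.0/24",
    "private-tgw": "10.28.12.0/24",
}
print(find_overlaps(lab_blocks))

# A hypothetical additional block that collides with the transit gateway block.
with_conflict = dict(lab_blocks, **{"new-block": "10.28.12.128/25"})
print(find_overlaps(with_conflict))
</code></pre></div>
<p>The first call returns an empty list, the second one reports the pair that collides. The same check can be run against the networks that already exist in the physical network.</p>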
<figure><a href="03.png"><picture><source srcset="/vcf9-nsx-vpc/03_hu_7f79112b87f1462c.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc/03_hu_7f79112b87f1462c.png"alt="vCenter Network Connectivity"width="600"
            height="698"/>
        </picture></a><figcaption><p>vCenter Network Connectivity (click to enlarge)</p></figcaption></figure>
<p>A few things are now happening in the background. Firstly, my VPC connectivity profile is being configured by NSX.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">In this profile I could also deactivate the N-S services to switch my Transit Gateway to distributed.</div>
    </aside>
<p>In addition, an external connectivity profile is created under Networking -&gt; External Connections.
This determines which T0 router the Transit Gateway is connected to.
Finally, the External Connectivity Profile is assigned to my Transit Gateway.
This assignment can be found under Networking -&gt; Transit Gateway.
Now that we have taken care of the external connection, we can concentrate on creating the actual VPC.</p>
<h2 id="create-a-virtual-private-cloud">Create a Virtual Private Cloud</h2>
<p>We are now about to create our first VPC. In VCF 9, there are two options.
One is via vCenter and the other is directly via NSX. Since we have not yet configured a VPC service profile, we will create our first VPC via NSX. To do this, we go to Add VPC in the VPC menu and are greeted by this attractive dialog box.</p>
<figure><a href="04.png"><picture><source srcset="/vcf9-nsx-vpc/04_hu_2f2eaf86c79ef995.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc/04_hu_2f2eaf86c79ef995.png"alt="New VPC"width="1709"
            height="1185"/>
        </picture></a><figcaption><p>New VPC in NSX (click to enlarge)</p></figcaption></figure>
<p>Here we assign a name to our VPC, select the previously created connectivity profile, and can select or create a service profile.
In the VPC service profile, we can activate a DHCP server and assign profile-specific DNS servers.
These must be accessible from the VPC. The same applies to the NTP server. We can also create subnet profiles.
The standard profiles that every normal NSX segment has are stored here. Normally, we don&rsquo;t need to worry about anything here.
In my VPC setup, I have activated a DHCP server in the service profile for my default project and specified my LAB DNS and global NTP server.</p>
<figure><a href="05.png"><picture><source srcset="/vcf9-nsx-vpc/05_hu_dfb80d5109eb12cf.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc/05_hu_dfb80d5109eb12cf.png"alt="VPC Service Profile"width="1161"
            height="692"/>
        </picture></a><figcaption><p>VPC Service Profile (click to enlarge)</p></figcaption></figure>
<p>Next, I define a private VPC IP CIDR (maximum of 5 per VPC). In my lab, this is 192.168.5.0/24. I will explain what these are and what they do in the next section. I also assign a short, descriptive log identifier and assign the VPC name.</p>
<figure><a href="06.png"><picture><source srcset="/vcf9-nsx-vpc/06_hu_1dcfbe6488db0cf2.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc/06_hu_1dcfbe6488db0cf2.png"alt="VPC Blue"width="1450"
            height="922"/>
        </picture></a><figcaption><p>VPC Blue (click to enlarge)</p></figcaption></figure>
<p>After saving and creating the VPC, the additional configurations must be carried out. Here, we can authorize users/groups for the VPC and assign different roles.
More importantly, however, we can create our actual subnets. I think now would be a good time to explain the different types of subnets.</p>
<h3 id="vpc-subnets">VPC Subnets</h3>
<p>In VCF 9, there are now three different types of VPC subnets. Strictly speaking, there are four, but the fourth is an AVI service subnet, which is just an automatically created private VPC subnet, so it doesn&rsquo;t really count.
This means that we effectively have three subnet types.</p>
<ul>
<li>Private VPC</li>
</ul>
<p>This subnet is the network that is only routed within the VPC. That is why it is private – makes sense so far. This subnet cannot be accessed outside the VPC and the network is assigned from the private VPC IP CIDRs.</p>
<ul>
<li>Private - Transit Gateway</li>
</ul>
<p>This network is new in VCF9. Similar to the private network, the network is created from the defined private transit gateway IP blocks (VPC connectivity profile).
However, it is routed via the default transit gateway, which allows you to access workloads in these networks from all VPCs connected to the same transit gateway.
Since you can currently only have one transit gateway per NSX project, you should use these networks with caution.
The useful thing is that even though the networks are routed via the transit gateway, they are only accessible to VPCs.
VMs connected to a T1 or T0 router via segments cannot reach these networks because the T0 does not have routes for private networks.</p>
<ul>
<li>Public</li>
</ul>
<p>The name says it all. These are subnets created from the previously defined external IP blocks.
They must be routed, must not overlap, and are managed via the VPC Connectivity Profile, as with the Transit Gateway.
Workloads in public networks are generally accessible from anywhere.</p>
<figure><a href="07.png"><picture><source srcset="/vcf9-nsx-vpc/07_hu_ecd401f7813270d.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc/07_hu_ecd401f7813270d.png"alt="VPC Subnets"width="1155"
            height="938"/>
        </picture></a><figcaption><p>VPC Subnets (click to enlarge)</p></figcaption></figure>
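<p>The three subnet types described above can be summarized in a small data structure. This Python sketch is only a compact restatement of the text. The class and field names are made up for illustration and do not correspond to NSX API objects.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">from dataclasses import dataclass


@dataclass(frozen=True)
class SubnetType:
    name: str
    address_source: str      # where NSX takes the address range from
    reachable_from: str      # who can reach workloads in this subnet
    routed_externally: bool  # must the physical network route this range?


SUBNET_TYPES = (
    SubnetType("Private VPC", "private VPC IP CIDRs",
               "inside the same VPC", False),
    SubnetType("Private - Transit Gateway", "Private - Transit Gateway IP Blocks",
               "all VPCs on the same Transit Gateway", False),
    SubnetType("Public", "VPC External IP Blocks",
               "anywhere", True),
)


def must_be_routed(subnet_types):
    """Names of the subnet types whose ranges the physical network has to route."""
    return [t.name for t in subnet_types if t.routed_externally]


for t in SUBNET_TYPES:
    print(f"{t.name}: from {t.address_source}, reachable from {t.reachable_from}")
print("needs physical routing:", must_be_routed(SUBNET_TYPES))
</code></pre></div>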
<h3 id="additional-configurations">Additional Configurations</h3>
<p>Now that we have clarified which subnets exist in a VPC, we can move on to the final settings to complete the VPC.
I create one subnet of each type and let NSX assign the address ranges automatically.
Since Default Outbound NAT is enabled in the VPC Connectivity Profile, NSX automatically creates an Outbound NAT Rule.
In addition, a group containing all VPC subnets is automatically created. This can be used in the distributed firewall.
We could also define additional firewall rules (requires a vDefend license – without a license, only stateless N/S rules can be defined).
I will explain the topic of network services later in this article.</p>
<figure><a href="08.png"><picture><source srcset="/vcf9-nsx-vpc/08_hu_dfaa8017ffb048ef.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc/08_hu_dfaa8017ffb048ef.png"alt="Additional Configurations"width="1230"
            height="630"/>
        </picture></a><figcaption><p>Additional Configurations (click to enlarge)</p></figcaption></figure>
<p>For my test, I create another VPC. My finished topology now looks like this.
As always, I will use Alpine Linux VMs for testing.</p>
<figure><a href="09.png"><picture><source srcset="/vcf9-nsx-vpc/09_hu_e3ca4708d5c5c907.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc/09_hu_e3ca4708d5c5c907.png"alt="VPC Topology"width="1370"
            height="1010"/>
        </picture></a><figcaption><p>VPC Topology (click to enlarge)</p></figcaption></figure>
<p>I created a second VPC via vCenter.
This option has been integrated into the network overview and is essentially the same as in NSX.
This creation option can also be controlled via permissions or even completely disabled.
There is only one difference compared to creation via NSX Manager.
In vCenter, you can currently only manage VPCs from the default project.
All VPCs from other projects are read-only. In addition, you cannot modify the profiles.
This is reserved for the NSX administrator.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">In my opinion, this is a good way to hand over the management of VPC networks to vSphere administrators or others.
No network knowledge is required to create a new subnet, as the profiles are predefined. Developers could, for example, create new networks in their development VPC themselves. No complicated automation is required, just a simple right-click in vCenter.</div>
    </aside>
<h2 id="initial-tests-with-the-vpc-blue">Initial tests with the VPC-Blue</h2>
<p>For my test, I created six VMs (see table) and assigned them to the corresponding networks of the VPCs in vCenter. The table shows the gateways automatically created by NSX.</p>
<table>
  <thead>
      <tr>
          <th>VM Name</th>
          <th>Network Type</th>
          <th>Gateway</th>
          <th>IP Address</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>alpine01-blue-private</td>
          <td>private</td>
          <td>192.168.5.1/28</td>
          <td>192.168.5.3</td>
      </tr>
      <tr>
          <td>alpine02-blue-transit</td>
          <td>transit</td>
          <td>10.28.12.1/28</td>
          <td>10.28.12.3</td>
      </tr>
      <tr>
          <td>alpine03-blue-public</td>
          <td>public</td>
          <td>192.168.72.17/28</td>
          <td>192.168.72.19</td>
      </tr>
      <tr>
          <td>alpine04-red-private</td>
          <td>private</td>
          <td>192.168.5.1/28</td>
          <td>192.168.5.4</td>
      </tr>
      <tr>
          <td>alpine05-red-transit</td>
          <td>transit</td>
          <td>10.28.12.17/28</td>
          <td>10.28.12.19</td>
      </tr>
      <tr>
          <td>alpine06-red-public</td>
          <td>public</td>
          <td>192.168.72.33/28</td>
          <td>192.168.72.35</td>
      </tr>
  </tbody>
</table>
<p>As you can see, the public and transit networks do not overlap and are created consecutively.
Since I have enabled DHCP in every VPC subnet, the first and second IP addresses are occupied. The first is the gateway, the second is the DHCP server.</p>
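<p>The addressing scheme from the table can be reproduced with Python&rsquo;s standard <code>ipaddress</code> module. This is a sketch under assumptions: I assume /24 blocks, and that the first /28 of the external block is the range NSX set aside for outbound NAT. The helper names <code>carve</code> and <code>describe</code> are made up for this example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress


def carve(block, prefix, count):
    """Return the first `count` consecutive subnets of the given prefix length."""
    subnets = ipaddress.ip_network(block).subnets(new_prefix=prefix)
    return [next(subnets) for _ in range(count)]


def describe(subnet):
    """First usable address is the gateway, the second the DHCP server."""
    gateway, dhcp, first_free = list(subnet.hosts())[:3]
    return {
        "subnet": str(subnet),
        "gateway": str(gateway),
        "dhcp": str(dhcp),
        "first_free": str(first_free),
    }


# Assumption: /24 blocks; the first external /28 is reserved for outbound NAT.
nat_range, blue_public, red_public = carve("192.168.72.0/24", 28, 3)
blue_transit, red_transit = carve("10.28.12.0/24", 28, 2)

for name, subnet in [
    ("blue-public", blue_public),
    ("red-public", red_public),
    ("blue-transit", blue_transit),
    ("red-transit", red_transit),
]:
    print(name, describe(subnet))
</code></pre></div>
<p>Under these assumptions, the sketch yields the same gateways as in the table, and the first free address in each subnet is the one the corresponding test VM received.</p>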
<h2 id="test-scenarios">Test scenarios</h2>
<p>I will perform certain tests for each VM from VPC-BLUE.
First, I will test Internet connectivity, then intra-VPC connectivity, i.e., communication with all VMs in VPC-BLUE, and finally inter-VPC connectivity, i.e., communication with all VMs in VPC-RED.
Finally, I will perform a connectivity test from outside the NSX environment.
I will do all of this with simple ICMP tests. To do this, all firewalls on NSX and the VMs have been disabled.</p>
<h3 id="alpine01-blue-private">alpine01-blue-private</h3>
<h4 id="internet-connectivity">Internet connectivity</h4>
<p>In my first test, I would like to see which destinations <em><strong>alpine01-blue-private</strong></em> can reach. The VM has been assigned the IP address 192.168.5.3 via DHCP.
Since N-S Services and Default Outbound NAT are enabled in the VPC Connectivity Profile, alpine01-blue-private reaches the Internet, and its traffic is source-NATed to the public IP address that NSX reserved for VPC SNAT.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">To make this work, NSX takes a subnet from the public network block in the background and reserves it for outbound NAT.
Each VPC gets its own outbound NAT IP.
VPC-Blue has been assigned the outbound NAT IP address 192.168.72.0/32, and VPC-RED has automatically been assigned the outbound NAT IP address 192.168.72.1/32.
The public subnets should therefore be sufficiently large, as a maximum of five subnets can be specified per profile.</div>
    </aside>
<figure><a href="10.png"><picture><source srcset="/vcf9-nsx-vpc/10_hu_4fdc10718f8ec09f.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc/10_hu_4fdc10718f8ec09f.png"alt="VPC NAT"width="1152"
            height="375"/>
        </picture></a><figcaption><p>VPC-Blue Auto SNAT Rule (click to enlarge)</p></figcaption></figure>
<h4 id="intra-vpc-connectivity">Intra VPC connectivity</h4>
<p>Next, I test the connectivity to the alpine02-blue-transit VM. This test is also successful, as the transit VM can be reached via the VPC gateway.
You might think that the SNAT rule applies here, but it does not, because NAT takes place on the transit gateway and not on the VPC gateway. The traffic between the two subnets is also routed in a distributed manner.
This can be seen very clearly in the NSX Traceflow. The same applies to the connectivity to the alpine03-blue-public VM.</p>
<h4 id="inter-vpc-connectivity">Inter VPC connectivity</h4>
<p>In this test, I first try to reach the VM alpine04-red-private. This is not possible because private networks are not routed via the Transit Gateway. In addition, the private networks overlap.
Next, I perform a connection test to alpine05-red-transit. This is successful, but the traffic is source-NATed on the transit gateway. This also means that the traffic is not routed in a distributed manner and must pass through the edge VM.
The same applies to traffic to alpine06-red-public.
<figure><a href="11.png"><picture><source srcset="/vcf9-nsx-vpc/11_hu_91e4a5aae2009fe3.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc/11_hu_91e4a5aae2009fe3.png"alt="TCPDUMP"width="1022"
            height="204"/>
        </picture></a><figcaption><p>TCPDUMP on alpine05-red-transit (click to enlarge)</p></figcaption></figure></p>
<h4 id="external-connectivity">External connectivity</h4>
<p>The VM cannot be accessed externally.
My core router does not have a route for the 192.168.5.0/28 subnet.
The private routes are not advertised to the T0, so the VM cannot be reached from the physical test client on which I am currently writing this blog.
To avoid writing a lengthy explanation for every test, I will present the results in table form.</p>
<table>
  <thead>
      <tr>
          <th>Test #</th>
          <th>Source</th>
          <th>Destination</th>
          <th>Connect</th>
          <th>NAT</th>
          <th>Dist. Routing</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>1</td>
          <td>blue-private</td>
          <td>Internet</td>
          <td>Yes</td>
          <td>Yes</td>
          <td>No</td>
      </tr>
      <tr>
          <td>2</td>
          <td>blue-private</td>
          <td>blue-transit</td>
          <td>Yes</td>
          <td>No</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>3</td>
          <td>blue-private</td>
          <td>blue-public</td>
          <td>Yes</td>
          <td>No</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>4</td>
          <td>blue-private</td>
          <td>red-private</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>5</td>
          <td>blue-private</td>
          <td>red-transit</td>
          <td>Yes</td>
          <td>Yes</td>
          <td>No</td>
      </tr>
      <tr>
          <td>6</td>
          <td>blue-private</td>
          <td>red-public</td>
          <td>Yes</td>
          <td>Yes</td>
          <td>No</td>
      </tr>
      <tr>
          <td>7</td>
          <td>External</td>
          <td>blue-private</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
  </tbody>
</table>
<h3 id="alpine02-blue-transit">alpine02-blue-transit</h3>
<table>
  <thead>
      <tr>
          <th>Test #</th>
          <th>Source</th>
          <th>Destination</th>
          <th>Connect</th>
          <th>NAT</th>
          <th>Dist. Routing</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>1</td>
          <td>blue-transit</td>
          <td>Internet</td>
          <td>Yes</td>
          <td>Yes</td>
          <td>No</td>
      </tr>
      <tr>
          <td>2</td>
          <td>blue-transit</td>
          <td>blue-private</td>
          <td>Yes</td>
          <td>No</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>3</td>
          <td>blue-transit</td>
          <td>blue-public</td>
          <td>Yes</td>
          <td>No</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>4</td>
          <td>blue-transit</td>
          <td>red-private</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>5</td>
          <td>blue-transit</td>
          <td>red-transit</td>
          <td>Yes</td>
          <td>No</td>
          <td>No</td>
      </tr>
      <tr>
          <td>6</td>
          <td>blue-transit</td>
          <td>red-public</td>
          <td>Yes</td>
          <td>Yes</td>
          <td>No</td>
      </tr>
      <tr>
          <td>7</td>
          <td>External</td>
          <td>blue-transit</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
  </tbody>
</table>
<h3 id="alpine03-blue-public">alpine03-blue-public</h3>
<table>
  <thead>
      <tr>
          <th>Test #</th>
          <th>Source</th>
          <th>Destination</th>
          <th>Connect</th>
          <th>NAT</th>
          <th>Dist. Routing</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>1</td>
          <td>blue-public</td>
          <td>Internet</td>
          <td>Yes</td>
          <td>Yes</td>
          <td>No</td>
      </tr>
      <tr>
          <td>2</td>
          <td>blue-public</td>
          <td>blue-private</td>
          <td>Yes</td>
          <td>No</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>3</td>
          <td>blue-public</td>
          <td>blue-transit</td>
          <td>Yes</td>
          <td>No</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>4</td>
          <td>blue-public</td>
          <td>red-private</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>5</td>
          <td>blue-public</td>
          <td>red-transit</td>
          <td>No</td>
          <td>-</td>
          <td>-</td>
      </tr>
      <tr>
          <td>6</td>
          <td>blue-public</td>
          <td>red-public</td>
          <td>Yes</td>
          <td>No</td>
          <td>No</td>
      </tr>
      <tr>
          <td>7</td>
          <td>External</td>
          <td>blue-public</td>
          <td>Yes</td>
          <td>-</td>
          <td>No</td>
      </tr>
  </tbody>
</table>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Since N-S services are enabled in the VPC Connectivity Profile, traffic between VPCs cannot be routed in a distributed manner.
Disabling N-S services would allow distributed routing between VPCs, but NAT would then no longer be possible.
Test 5 from alpine02-blue-transit is interesting because NAT is not used there, but since we have N-S services active, the traffic still has to pass through the Edge VM.</div>
    </aside>
<p>I hope I was able to explain clearly how VPCs and the various subnets work. They open up exciting possibilities.
However, I would like to discuss one more feature, namely the option of assigning an external IP address to a VM in a private VPC subnet. This can be done via NSX as well as via vCenter.</p>
<h2 id="assign-external-ip">Assign External IP</h2>
<p>With this feature, you can make a VM in a private VPC network externally reachable with just two clicks in vCenter.
You can think of it a bit like an Elastic IP in a public cloud.</p>
<figure><a href="12.png"><picture><source srcset="/vcf9-nsx-vpc/12_hu_94e031b99c100a90.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc/12_hu_94e031b99c100a90.png"alt="External IP"width="1012"
            height="826"/>
        </picture></a><figcaption><p>Assign External IP (click to enlarge)</p></figcaption></figure>
<p>All you have to do is select the VM&rsquo;s network adapter and confirm. The magic happens in the background. But let&rsquo;s take a closer look. An external IP was created in NSX under VPC / Network Services. If we take a closer look, we see that an IP address from the public subnet of VPC Blue was assigned.</p>
<figure><a href="13.png"><picture><source srcset="/vcf9-nsx-vpc/13_hu_ce8c4194dae4979.png" type="image/png">
          <img
            src="/vcf9-nsx-vpc/13_hu_ce8c4194dae4979.png"alt="External IP VPC-Blue"width="1159"
            height="731"/>
        </picture></a><figcaption><p>Assign External IP VPC-Blue (click to enlarge)</p></figcaption></figure>
<p>The VM alpine01-blue-private is now accessible from anywhere via the IP address 192.168.72.2.
Pretty cool. You can also remove the assignment at any time; the public IP is then returned to the IP address pool and can be used elsewhere.</p>
<p>This is implemented via a DNAT rule that is not visible in the NSX GUI. You can check it on the active Edge by looking up the relevant interface with the command <em><strong>get firewall interfaces</strong></em> and then displaying the NAT rules with <em><strong>get firewall &lt;uuid&gt; ruleset rules</strong></em>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">vcf09-edge01&gt; get firewall 0bbcf4b5-1321-40b4-9325-f8023c06bdf2 ruleset rules 
</span></span><span class="line"><span class="cl">Tue Jun <span class="m">17</span> <span class="m">2025</span> UTC 16:10:06.440
</span></span><span class="line"><span class="cl">DNAT rule count: <span class="m">1</span>
</span></span><span class="line"><span class="cl">    Rule ID   : <span class="m">536875009</span>
</span></span><span class="line"><span class="cl">    Rule      : in inet protocol any postnat from any to addrset <span class="o">{</span>192.168.72.2<span class="o">}</span> dnat EIP table id: b1274d77-e09e-4cf5-8377-4501f54e296e
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">SNAT rule count: <span class="m">2</span>
</span></span><span class="line"><span class="cl">    Rule ID   : <span class="m">536875008</span>
</span></span><span class="line"><span class="cl">    Rule      : out inet protocol any prenat from addrset <span class="o">{</span>192.168.5.3<span class="o">}</span> to any snat EIP table id: b1274d77-e09e-4cf5-8377-4501f54e296e
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    Rule ID   : <span class="m">536873996</span>
</span></span><span class="line"><span class="cl">    Rule      : out protocol any prenat from ip 192.168.5.0/24 to any snat ip 192.168.72.1 port 37001-65535
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">Firewall rule count: <span class="m">0</span>
</span></span></code></pre></div><p>Here we see the DNAT rule for 192.168.72.2, which makes our VM accessible from outside the network, along with an SNAT rule that references the same EIP table. And the best part is that it&rsquo;s all really simple and requires no network knowledge.</p>
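<p>To make the relationship between the two EIP rules easier to see, here is a minimal Python sketch that parses the rule lines from the output above and pairs the DNAT and SNAT entries by their EIP table id. The function name <code>eip_pairs</code> and the regular expression are my own assumptions for illustration, not an NSX tool or API, and the pattern is only fitted to the output format shown here. I am also assuming that 192.168.5.3 is the private address of the VM.</p>

```python
import re

# Rule lines copied from the Edge CLI output above (assumed format).
RULESET = """
Rule      : in inet protocol any postnat from any to addrset {192.168.72.2} dnat EIP table id: b1274d77-e09e-4cf5-8377-4501f54e296e
Rule      : out inet protocol any prenat from addrset {192.168.5.3} to any snat EIP table id: b1274d77-e09e-4cf5-8377-4501f54e296e
Rule      : out protocol any prenat from ip 192.168.5.0/24 to any snat ip 192.168.72.1 port 37001-65535
"""

# Matches only EIP rules: an addrset, the NAT type, and the EIP table id.
EIP_RULE = re.compile(
    r"addrset \{([0-9.]+)\} (?:to any )?(dnat|snat) EIP table id: ([0-9a-f-]+)"
)


def eip_pairs(ruleset):
    """Group EIP rules by table id: {table_id: {"dnat": ip, "snat": ip}}."""
    pairs = {}
    for line in ruleset.splitlines():
        match = EIP_RULE.search(line)
        if match:
            address, nat_type, table_id = match.groups()
            pairs.setdefault(table_id, {})[nat_type] = address
    return pairs


for table_id, rule in eip_pairs(RULESET).items():
    print(f"{rule['dnat']} (external) maps to {rule['snat']} (private), table {table_id}")
```

<p>The third rule in the output is the default SNAT for the whole private subnet and carries no EIP table id, so the sketch skips it on purpose.</p>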
<h2 id="conclusion">Conclusion</h2>
<p>VPCs are more prevalent than ever, and VMware is pursuing a clear path toward multi-tenancy in VCF 9 and NSX 9. Many familiar mechanisms have now been combined in NSX 9, and exciting developments are still on the way. In this article, I have only discussed centralized deployment. Distributed deployment is perhaps even more exciting, but it would have made this article too long. I also haven&rsquo;t talked about security or NSX projects, because both deployment options can be combined to bring significant added value to an on-premises cloud. I will therefore be publishing further articles on VCF 9, NSX 9, and VPCs in the near future. In my opinion, VCF 9 is a really big leap forward, and a lot has happened in NSX 9 in particular. One thing is clear: the next few weeks and months will not be boring.</p>
<h3 id="one-more-thing">One more thing</h3>
<p>Perhaps a little fun fact. Anyone who has read my article on <a href="https://sdn-warrior.org/posts/ms-a2/">MS-A2</a> will know that I wrote that I run a VCF installation on it – well, now I can reveal that my VCF 9 lab is currently running nested on my single Minisforum MS-A2 server.</p>
]]></content>
		</item>
		
		<item>
			<title>Tales from the Lab - Minisforum MS-A2</title>
			<link>https://sdn-warrior.org/posts/ms-a2/</link>
			<pubDate>Sat, 14 Jun 2025 10:14:45 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/ms-a2/</guid>
			<description><![CDATA[A quick test of the new MS-A2 and how it compares to the MS-01.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>After being impressed with the performance and form factor of the Minisforum <strong>MS-01</strong>, I had to get my hands on their latest beast: the <strong>MS-A2</strong>, powered by AMD&rsquo;s latest Zen 5 architecture.
It&rsquo;s also exciting to test a gadget before <a href="https://williamlam.com/">William Lam</a>. Thanks for letting me be the first (I&rsquo;m just kidding, of course).</p>
<p>In this post, I’ll give you a quick rundown of both systems and how they compare—especially from a homelab perspective focused on virtualization, low noise, and energy-efficient high performance.</p>
<p>A quick disclaimer first: I am not a hardware tester and do not have any real measuring equipment. Everything I report here is based on my experience with both devices in my environment.</p>
<figure><a href="01.jpg"><picture><source srcset="/ms-a2/01_hu_6cc83136bb904261.jpg" type="image/jpeg">
          <img
            src="/ms-a2/01_hu_6cc83136bb904261.jpg"alt="MS-A2"width="2200"
            height="1650"/>
        </picture></a><figcaption><p>MS-A2 (click to enlarge)</p></figcaption></figure>
<h2 id="use-case">Use Case</h2>
<p>Perhaps a quick word about what I do with the machines (for anyone who doesn&rsquo;t read my blog regularly and stumbled across the article via Google).
The minicomputers primarily provide compute and storage resources for my labs. My labs are 99% nested, which means that I install another hypervisor on top of the hypervisor to then perform my actual tests. Nesting offers several advantages: I don&rsquo;t have to rebuild a test lab, and I can move a finished lab back and forth between hardware platforms, for example.
The VCF setup I used to perform the more practical tests allows me to change the hardware base relatively quickly.</p>
<p>But let&rsquo;s start with the less exciting stuff—tech specs.</p>
<h2 id="tech-specs-comparison">Tech Specs Comparison</h2>
<table>
  <thead>
      <tr>
          <th>Feature</th>
          <th>Minisforum MS-01</th>
          <th>Minisforum MS-A2</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>CPU</strong></td>
          <td>Intel Core i9 13900H (14C/20T, 5.4 GHz Turbo)</td>
          <td>AMD Ryzen™ 9 9955HX (16C/32T, Zen 5, up to 5.4 GHz)</td>
      </tr>
      <tr>
          <td><strong>Architecture</strong></td>
          <td>Raptor Lake (Intel 7, 10 nm class)</td>
          <td>Zen 5 (4nm)</td>
      </tr>
      <tr>
          <td><strong>TDP</strong></td>
          <td>45W</td>
          <td>55W</td>
      </tr>
      <tr>
          <td><strong>RAM Support</strong></td>
          <td>2x DDR5 SO-DIMM (up to 64 GB)</td>
          <td>2x DDR5 SO-DIMM (up to 96 GB)</td>
      </tr>
      <tr>
          <td><strong>Storage</strong></td>
          <td>2x M.2 NVMe (PCIe 3.0), 1x M.2 NVMe (PCIe 4.0)</td>
          <td>3x M.2 NVMe (PCIe 3.0)</td>
      </tr>
      <tr>
          <td><strong>Network</strong></td>
          <td>2x 2.5G Intel i226 NICs / 2x 10G Intel X710</td>
          <td>1x 2.5G Intel i226 NIC  / 1x 2.5G Realtek NIC / 2x 10G Intel X710</td>
      </tr>
      <tr>
          <td><strong>Graphics</strong></td>
          <td>Intel® Iris® Xe Graphics</td>
          <td>Radeon 610M</td>
      </tr>
      <tr>
          <td><strong>Cooling</strong></td>
          <td>Dual-fan active cooling (customizable)</td>
          <td>Dual-fan active cooling (improved thermal design)</td>
      </tr>
      <tr>
          <td><strong>Size</strong></td>
          <td>154 x 150 x 64 mm</td>
          <td>154 x 150 x 64 mm (identical chassis)</td>
      </tr>
      <tr>
          <td><strong>Power Supply</strong></td>
          <td>External 19V DC</td>
          <td>External 19V DC</td>
      </tr>
      <tr>
          <td><strong>TPM</strong></td>
          <td>Yes (firmware TPM)</td>
          <td>Yes (firmware TPM)</td>
      </tr>
      <tr>
          <td><strong>vPro/SEV Support</strong></td>
          <td>vPro (limited)</td>
          <td></td>
      </tr>
  </tbody>
</table>
<p>At first glance, the two devices have a lot in common. Fortunately, the form factor is the same, so I don&rsquo;t have to buy a new rack mount. The exciting details are in the small print. Officially, the MS-01 supports only 64 GB of RAM and the MS-A2 up to 96 GB.
Nevertheless, I have installed 128 GB of RAM in both systems. I am using the Crucial DDR5 RAM 128GB Kit (2x64GB 5600MHz SODIMM).</p>
<figure><a href="02.jpg"><picture><source srcset="/ms-a2/02_hu_48b66bf40d13bfa1.jpg" type="image/jpeg">
          <img
            src="/ms-a2/02_hu_48b66bf40d13bfa1.jpg"alt="RAM"width="1621"
            height="1216"/>
        </picture></a><figcaption><p>MS-A2 - RAM (click to enlarge)</p></figcaption></figure>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">I tested both devices with 128 GB RAM and they work fine for me. However, I must point out that there is no guarantee that it will work.</div>
    </aside>
<p>The second difference is the CPU architecture. While Intel relies on its hybrid (big/little) architecture, which gives us performance and efficiency cores, AMD&rsquo;s Zen 5 has symmetrical cores and uses SMT, which is equivalent to Intel&rsquo;s Hyper-Threading.
I have described how to make P and E cores usable with ESXi <a href="https://sdn-warrior.org/posts/nuc/#using-pe-cores">here</a>.</p>
<p>In my opinion, this is already a plus point for the MS-A2, as I can install ESXi without any workarounds.
What I find a little strange is the network card selection. While the MS-01 has two 2.5G cards from Intel, the MS-A2 comes with a strange configuration of one 2.5G Intel card and one 2.5G Realtek card. Unfortunately, there is no Realtek driver for built-in cards in ESXi. Fun fact: Realtek USB network cards do work, though. Overall, this isn&rsquo;t important to me, as I use the X710 Intel cards throughout, since my network is completely 10G.</p>
<p>In summary, the following key data can be noted:</p>
<ul>
<li>MS-01: 14 cores / 128 GB RAM / 2x 10G / VMware ESXi, 8.0.3, 24674464</li>
<li>MS-A2: 16 cores (32 threads) / 128 GB RAM / 2x 10G / VMware ESXi, 8.0.3, 24674464</li>
</ul>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Since I use iSCSI storage for both systems, I don&rsquo;t need to compare storage. However, since both servers are equipped with a 2TB Samsung 990 Pro in the PCIe 4 slot, the values should be identical.</div>
    </aside>
<figure><a href="03.jpg"><picture><source srcset="/ms-a2/03_hu_887134cc52d151bc.jpg" type="image/jpeg">
          <img
            src="/ms-a2/03_hu_887134cc52d151bc.jpg"alt="MS-A2 inside"width="1815"
            height="1361"/>
        </picture></a><figcaption><p>MS-A2 - Inside (click to enlarge)</p></figcaption></figure>
<h2 id="cinebench-multicore-test--ms-01-vs-ms-a2">Cinebench Multicore Test – MS-01 vs MS-A2</h2>
<p>As a first pragmatic test, I created a Windows 11 VM on both systems. Each VM was assigned the <strong>maximum number of vCPUs</strong> available on the host—meaning:</p>
<ul>
<li><strong>MS-01</strong>: 14 vCPUs (max usable cores on the i9 13900H)</li>
<li><strong>MS-A2</strong>: 32 vCPUs (max usable logical processors on the Ryzen 9 9955HX)</li>
</ul>
<p>Both ESXi servers had the <strong>power policy set to High Performance</strong>, and Cinebench 2024.01 was executed in a continuous loop for 30 minutes to capture thermal behavior and sustained performance under load.</p>
<figure><a href="04.png"><picture><source srcset="/ms-a2/04_hu_be80f4599837500a.png" type="image/png">
          <img
            src="/ms-a2/04_hu_be80f4599837500a.png"alt="Cinebench"width="2189"
            height="781"/>
        </picture></a><figcaption><p>Cinebench on MS-A2 (click to enlarge)</p></figcaption></figure>
<p>This may not be a perfect test, but since the MS-01 is already in lab operation, I didn&rsquo;t want to install Windows 11 natively. Of course, the Windows 11 test VM was the only VM running on each host so as not to distort the results.</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>MS-01</th>
          <th>MS-A2</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Power Policy</strong></td>
          <td>High Performance</td>
          <td>High Performance</td>
      </tr>
      <tr>
          <td><strong>Idle Power Consumption</strong></td>
          <td>40 Watts</td>
          <td>36 Watts</td>
      </tr>
      <tr>
          <td><strong>Cinebench Power Draw</strong></td>
          <td>86 Watts</td>
          <td>121 Watts</td>
      </tr>
      <tr>
          <td><strong>Cinebench Score</strong></td>
          <td>734 pts (14 vCPUs)</td>
          <td>1670 pts (32 vCPUs)</td>
      </tr>
  </tbody>
</table>
<p>The difference is substantial. Even though the MS-A2 pulls more power under load, it <strong>more than doubles the Cinebench score</strong> compared to the MS-01. Idle consumption is slightly better on the MS-A2, likely due to the newer and more efficient AMD Zen 5 architecture. Performance per watt in multicore scenarios clearly favors the MS-A2.</p>
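<p>To put a number on the performance-per-watt claim, here is a small Python sketch that simply divides the Cinebench score by the power draw under load from the table above. The variable and function names are my own, and since the wattage comes from a smart plug, the result is a rough indication rather than a lab-grade figure.</p>

```python
# Values taken from the table above (wall power measured with a smart plug).
RESULTS = {
    "MS-01": {"score": 734, "load_watts": 86},
    "MS-A2": {"score": 1670, "load_watts": 121},
}


def points_per_watt(result):
    """Cinebench points per watt of wall power under load."""
    return result["score"] / result["load_watts"]


for host, result in RESULTS.items():
    print(f"{host}: {points_per_watt(result):.1f} pts/W")

score_ratio = RESULTS["MS-A2"]["score"] / RESULTS["MS-01"]["score"]
efficiency_ratio = points_per_watt(RESULTS["MS-A2"]) / points_per_watt(RESULTS["MS-01"])
print(f"Score ratio: {score_ratio:.2f}x, efficiency ratio: {efficiency_ratio:.2f}x")
```

<p>With these numbers, the MS-01 lands at roughly 8.5 points per watt and the MS-A2 at roughly 13.8, so the MS-A2 delivers about 1.6 times the work per watt in this test.</p>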

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">The power consumption was measured using an Aqara smart plug. This is therefore not a scientifically accurate result. ;) So don&rsquo;t sue me if your electricity bills skyrocket.</div>
    </aside>
<p>Minisforum mentions more efficient cooling on its website, which may well be true, but the MS-A2 is somewhat louder and, above all, significantly warmer than the MS-01, which seems logical considering the power consumption.
However, both are still within acceptable limits. I wouldn&rsquo;t want to have the MS-A2 running 24/7 at full load next to my pillow, but anyone who knows my blog knows that I shut down devices I don&rsquo;t need anyway.
So my colleagues on Teams calls haven&rsquo;t complained about a tornado raging next to me.</p>
<p>All in all, I would say that the noise is justified for the performance. When idle, however, the MS-A2 is inaudible.
My Mikrotik switch is louder, as is my Dyson fan, which is currently running thanks to the 28 degrees Celsius temperature.</p>
<h2 id="internal-network-performance--vm-to-vm-on-the-same-host">Internal Network Performance – VM to VM on the Same Host</h2>
<p>The external network performance was identical on both servers, which is not surprising since both have the same network cards and both CPUs have enough power to run a 10 Gb/s network.
That&rsquo;s why I&rsquo;m not going to bother with detailed tests at this point.</p>
<p>What interests me much more is the network performance between two VMs on the same host, which is often a decisive factor for me because I run a lot of nested labs, including nested vSAN.</p>
<h3 id="test-setup">Test Setup</h3>
<ul>
<li><strong>2x Alpine Linux VMs</strong> (2 vCPU, 2 GB RAM each)</li>
<li><code>alpine-a</code>: runs <code>iperf3 -s</code></li>
<li><code>alpine-b</code>: runs <code>iperf3 -c 192.168.16.11</code></li>
<li>No optimizations applied (default config, no tuning)</li>
<li>Only these two VMs were running on the host</li>
</ul>
<h3 id="ms-01">MS-01</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">alpine-b:~# iperf3 -c 192.168.16.11
</span></span><span class="line"><span class="cl">Connecting to host 192.168.16.11, port <span class="m">5201</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span> <span class="nb">local</span> 192.168.16.12 port <span class="m">55930</span> connected to 192.168.16.11 port <span class="m">5201</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span> ID<span class="o">]</span> Interval           Transfer     Bitrate         Retr  Cwnd
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   0.00-1.00   sec  5.03 GBytes  43.2 Gbits/sec    <span class="m">0</span>   2.80 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   1.00-2.00   sec  4.33 GBytes  37.2 Gbits/sec   <span class="m">60</span>   2.75 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   2.00-3.00   sec  4.96 GBytes  42.6 Gbits/sec    <span class="m">0</span>   2.75 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   3.00-4.00   sec  5.88 GBytes  50.5 Gbits/sec    <span class="m">0</span>   3.01 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   4.00-5.00   sec  6.93 GBytes  59.6 Gbits/sec    <span class="m">0</span>   3.01 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   5.00-6.00   sec  6.99 GBytes  60.1 Gbits/sec    <span class="m">0</span>   3.24 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   6.00-7.00   sec  6.54 GBytes  56.2 Gbits/sec    <span class="m">0</span>   3.24 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   7.00-8.00   sec  6.32 GBytes  54.3 Gbits/sec    <span class="m">0</span>   3.24 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   8.00-9.00   sec  6.62 GBytes  56.9 Gbits/sec  <span class="m">139</span>   2.59 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   9.00-10.00  sec  6.37 GBytes  54.8 Gbits/sec    <span class="m">0</span>   2.78 MBytes
</span></span><span class="line"><span class="cl">- - - - - - - - - - - - - - - - - - - - - - - - -
</span></span><span class="line"><span class="cl"><span class="o">[</span> ID<span class="o">]</span> Interval           Transfer     Bitrate         Retr
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   0.00-10.00  sec  60.0 GBytes  51.5 Gbits/sec  <span class="m">199</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   0.00-10.01  sec  60.0 GBytes  51.5 Gbits/sec           receiver
</span></span></code></pre></div><h3 id="ms-a2">MS-A2</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">alpine-b:~# iperf3 -c 192.168.16.11
</span></span><span class="line"><span class="cl">Connecting to host 192.168.16.11, port <span class="m">5201</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span> <span class="nb">local</span> 192.168.16.12 port <span class="m">35168</span> connected to 192.168.16.11 port <span class="m">5201</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span> ID<span class="o">]</span> Interval           Transfer     Bitrate         Retr  Cwnd
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   0.00-1.00   sec  9.47 GBytes  81.3 Gbits/sec    <span class="m">0</span>   2.06 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   1.00-2.00   sec  10.0 GBytes  85.8 Gbits/sec    <span class="m">0</span>   3.58 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   2.00-3.00   sec  10.2 GBytes  87.4 Gbits/sec    <span class="m">0</span>   3.78 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   3.00-4.00   sec  10.1 GBytes  86.7 Gbits/sec    <span class="m">0</span>   3.78 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   4.00-5.00   sec  10.1 GBytes  87.1 Gbits/sec    <span class="m">0</span>   3.78 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   5.00-6.00   sec  10.1 GBytes  87.0 Gbits/sec    <span class="m">0</span>   3.78 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   6.00-7.00   sec  10.3 GBytes  88.6 Gbits/sec    <span class="m">0</span>   3.97 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   7.00-8.00   sec  10.3 GBytes  88.7 Gbits/sec    <span class="m">0</span>   3.97 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   8.00-9.00   sec  10.1 GBytes  87.2 Gbits/sec    <span class="m">0</span>   3.97 MBytes
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   9.00-10.00  sec  10.1 GBytes  86.9 Gbits/sec    <span class="m">0</span>   3.97 MBytes
</span></span><span class="line"><span class="cl">- - - - - - - - - - - - - - - - - - - - - - - - -
</span></span><span class="line"><span class="cl"><span class="o">[</span> ID<span class="o">]</span> Interval           Transfer     Bitrate         Retr
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   0.00-10.00  sec   <span class="m">101</span> GBytes  86.7 Gbits/sec    <span class="m">0</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span>  5<span class="o">]</span>   0.00-10.03  sec   <span class="m">101</span> GBytes  86.4 Gbits/sec           receiver
</span></span></code></pre></div><h3 id="an-attempt-to-interpret-the-results">An attempt to interpret the results</h3>
<p>The difference is likely due to the internal memory bandwidth, PCIe performance, and possibly better NUMA handling on the MS-A2’s Zen 5 platform.
Since this is purely intra-host, the NICs themselves aren’t part of the data path, which makes this a good synthetic benchmark for VM-to-VM memory and CPU path efficiency.
The MS-A2 has a clear lead here—offering nearly 70% higher throughput compared to the MS-01.
In addition, the results on the MS-01 fluctuate, which I suspect happens when a vCPU is scheduled onto an E-core. Since the Ryzen 9 9955HX has no E-cores, its results are noticeably more stable.</p>
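<p>Both observations, the higher throughput and the smaller fluctuation, can be checked against the per-second values of the two iperf3 runs above. The following Python sketch is only an illustration with names of my own choosing. It uses the coefficient of variation (standard deviation divided by the mean) as a simple measure of how much a run fluctuates.</p>

```python
from statistics import mean, pstdev

# Per-second bitrates in Gbit/s, copied from the two iperf3 runs above.
MS_01 = [43.2, 37.2, 42.6, 50.5, 59.6, 60.1, 56.2, 54.3, 56.9, 54.8]
MS_A2 = [81.3, 85.8, 87.4, 86.7, 87.1, 87.0, 88.6, 88.7, 87.2, 86.9]


def summarize(samples):
    """Return (mean, coefficient of variation in percent) for a run."""
    average = mean(samples)
    return average, 100 * pstdev(samples) / average


for host, samples in (("MS-01", MS_01), ("MS-A2", MS_A2)):
    average, variation = summarize(samples)
    print(f"{host}: mean {average:.1f} Gbit/s, variation {variation:.1f} %")

gain = 100 * (mean(MS_A2) / mean(MS_01) - 1)
print(f"MS-A2 advantage: {gain:.0f} %")
```

<p>The means match the iperf3 summary lines (about 51.5 and 86.7 Gbit/s), which works out to an advantage of roughly 68 % for the MS-A2, and the variation on the MS-01 is several times higher than on the MS-A2.</p>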

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">I am not an expert in CPU architecture, so I can only speculate. So take the statements made with a grain of salt.
However, this does not change the fact that my tests, after repeated trials, have always reliably come out in favor of the MS-A2.</div>
    </aside>
<h2 id="noise---methodologically-incorrect-investigation">Noise - Methodologically incorrect investigation</h2>
<p>I measured the noise level of the MS-A2 “professionally” with my iPhone. There are a thousand and one apps for this, so I chose the one with the most reviews (Decibel X, and no, this is not an ad for the app), which I can&rsquo;t calibrate. If anyone wants to give me a dB meter, I&rsquo;ll do more professional tests in the future. :D</p>
<p>I didn&rsquo;t have time to compare it with the MS-01, as I added this test afterwards at the request of a LinkedIn user. I measured from a distance of 2 meters, as I normally sit between 2 and 3 meters away from my lab.
I get a baseline of 32-34 dB. It should be noted that it is quite warm in my office today, so my Mikrotik switch is a little louder than usual, but the point is to measure the relative increase.
The background noise in my home office ranges from quiet whispering to the gentle rustling of leaves in the wind. I live in a relatively rural area, so the loudest sounds are my neighbors or the owl outside my window.</p>
<p>In idle mode, the MS-A2 is completely inaudible in my environment. I haven&rsquo;t noticed any real change on the dB meter. The fan is drowned out by the noise of the Mikrotik switch – I really should replace the fans in that box.
I started my VCF lab on the MS-A2 and waited about 10 minutes. The noise level rises to 35-36 dB and regularly peaks at 42.3-43 dB. I have the fan curve set to performance, which means that the MS-A2 spins up quickly and then calms down again quickly, resulting in a wave-like background noise that constantly rises and falls.
I need to see if I can set the fan curve and response behavior in the same way as on an ASUS NUC. The NUC has an incredible number of options for fan management.</p>
<figure><picture><source srcset="/ms-a2/08_hu_550da8e561c48231.jpg" type="image/jpeg">
          <img
            src="/ms-a2/08_hu_550da8e561c48231.jpg"alt="A2 noise"width="589"
            height="1042"/>
        </picture><figcaption><p>MS-A2 noise level</p></figcaption></figure>
<p>The blue curve is the average, and you can see a clear increase in the high frequencies, which are the frequencies that most people find disturbing.
To put this into context, 42 dB is still considered relatively quiet and may not even be noticeable in a city apartment.
However, a difference of up to 10 dB is still significant. Humans perceive this as roughly twice as loud.</p>
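<p>The rule of thumb behind that statement can be written down in a few lines. This Python sketch is only an illustration with function names of my own. It uses the common approximation that perceived loudness doubles for every 10 dB, while the physical sound power grows by a factor of ten. The 33 dB and 43 dB values are taken from my measurements above.</p>

```python
def power_ratio(delta_db):
    """Ratio of sound power for a level difference in dB."""
    return 10 ** (delta_db / 10)


def loudness_ratio(delta_db):
    """Rule of thumb: perceived loudness doubles for every +10 dB."""
    return 2 ** (delta_db / 10)


idle_db = 33.0   # middle of the 32-34 dB baseline measured above
peak_db = 43.0   # upper end of the peaks measured under load
delta = peak_db - idle_db

print(f"+{delta:.0f} dB: {power_ratio(delta):.0f}x sound power, "
      f"about {loudness_ratio(delta):.1f}x perceived loudness")
```

<p>So the peaks carry ten times the sound power of the baseline, but to the ear they are only about twice as loud.</p>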
<h2 id="but-can-it-run-doom-vcf">But can it run <del>doom</del> VCF?</h2>
<p>First of all, I did not install VCF on the MS-A2. Why? Time!
But I did migrate an existing nested lab to the MS-A2. Remember? Introduction and all that? That&rsquo;s exactly why I build my labs nested.
Here&rsquo;s a brief overview of how my VCF lab is set up.</p>
<p>It&rsquo;s a consolidated design and doesn&rsquo;t use vSAN. It&rsquo;s also built on three nested ESXi servers.
This isn&rsquo;t officially supported with VCF 5.X, but to be honest, I&rsquo;m constantly doing things that aren&rsquo;t supported, and that&rsquo;s not going to bother us here.
I deployed the original lab on two MS-01s and have been using it for quite some time.
The 3 nested ESXi hosts each have 8 vCPUs, 45 GB RAM, and a small boot disk.
The principal storage is NFS. If you&rsquo;re wondering how that works, I wrote a blog post about it <a href="https://sdn-warrior.org/posts/vcf-import-cluster/">here</a>.</p>
<p>If you&rsquo;re good at math, you&rsquo;ll have noticed that 3x 45 GB RAM is more than 128 GB RAM, but a little oversubscription never hurt anyone (except when it comes to licenses).</p>
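<p>For everyone who doesn&rsquo;t want to do the math in their head, here it is as a tiny Python sketch. The constant names are my own, and the calculation only looks at the memory configured for the nested hosts, not at what the physical ESXi host needs for itself.</p>

```python
NESTED_HOSTS = 3
RAM_PER_NESTED_HOST_GB = 45
PHYSICAL_RAM_GB = 128

configured_gb = NESTED_HOSTS * RAM_PER_NESTED_HOST_GB
overcommit_gb = configured_gb - PHYSICAL_RAM_GB
ratio = configured_gb / PHYSICAL_RAM_GB

print(f"Configured: {configured_gb} GB, physical: {PHYSICAL_RAM_GB} GB")
print(f"Overcommitted by {overcommit_gb} GB (ratio {ratio:.2f}:1)")
```

<p>That is 135 GB configured on 128 GB of physical RAM, so an overcommitment of 7 GB or about 5 %.</p>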
<p>Since live migration between Intel and AMD hosts is not possible, I migrated everything offline. Then I started VCF, and the poor MS-A2 immediately went to full load.</p>
<figure><picture><source srcset="/ms-a2/05_hu_fd3a693a75c2dbc1.png" type="image/png">
          <img
            src="/ms-a2/05_hu_fd3a693a75c2dbc1.png"alt="A2 full load"width="1098"
            height="384"/>
        </picture><figcaption><p>MS-A2 on full load</p></figcaption></figure>
<p>Poor sweet summer child - the MS-A2 becomes slightly louder and the temperatures rise.</p>
<ul>
<li>I received the first sign of life after approx. 3:30 minutes. My vCenter responded to the first ping.
Of course, the ping times were far from ideal.</li>
<li>After 6:50 minutes, I can log in to vCenter.
Only NSX Manager is taking its time. But we are already familiar with that.</li>
<li>After a good 8 minutes, NSX Manager is also available, but still a little sluggish.</li>
<li>After 10 minutes, everything has settled down enough that you can use the lab really well.</li>
</ul>
<p>I don&rsquo;t notice any difference in responsiveness compared to when it&rsquo;s running on 2 MS-01.</p>
<h2 id="wheres-the-poop-robin">Where&rsquo;s the poop, Robin?</h2>
<p>I&rsquo;m not sure which of my readers are familiar with the series, but never mind. AMD and NSX is one of those things. Officially, NSX does not work on Ryzen or other AMD consumer CPUs.</p>
<p>At first, I was a little surprised, because I knew that NSX Edges cause problems during installation or upgrade on AMD consumer CPUs, but I deployed my Edges on Intel. A quick look at the console of one of my Edge VMs revealed the following.</p>
<figure><picture><source srcset="/ms-a2/06_hu_a5f1805ef7776353.png" type="image/png">
          <img
            src="/ms-a2/06_hu_a5f1805ef7776353.png"alt="Edge VM"width="1490"
            height="1021"/>
        </picture><figcaption><p>Stress on the EdgeVM</p></figcaption></figure>
<p>So, at first, I was pretty clueless. Luckily, there&rsquo;s the awesome vExpert community, because a) I&rsquo;m getting older and forgetting more than I want to admit, and b) I don&rsquo;t always know everything.
My thanks go to <a href="https://www.linkedin.com/in/abbed-sedkaoui-678718151/">Abbed Sedkaoui</a> for his quick help. He put me on the right track. By the way, he has a great <a href="https://strivevirtually.net/">blog</a>.</p>
<p>The problem is that when the Edge VM is started, a Python script is executed that checks whether an AMD CPU is installed and, if it is not an AMD EPYC, simply throws an error and prevents DPDK from starting. This leaves the Edge offline.
<figure><picture><source srcset="/ms-a2/07_hu_31702297df6582ba.jpg" type="image/jpeg">
          <img
            src="/ms-a2/07_hu_31702297df6582ba.jpg"alt="Edge VM Error"width="1496"
            height="617"/>
        </picture><figcaption><p>EdgeVM - Here&rsquo;s the problem</p></figcaption></figure></p>
<p>The solution is simple: log in to the Edge VM as root and edit the script <code>/opt/vmware/nsx-edge/bin/config.py</code>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl"># AMD only claims DPDK support of the following processors:
</span></span><span class="line"><span class="cl"># (1) AMD EPYC 7XX1 series and newer.
</span></span><span class="line"><span class="cl"># (2) AMD EPYC 3000 Embedded Family and newer.
</span></span><span class="line"><span class="cl">vendor_info = &#39;&#39;
</span></span><span class="line"><span class="cl">model_name = &#39;&#39;
</span></span><span class="line"><span class="cl">for line in cpuinfo.splitlines():
</span></span><span class="line"><span class="cl">    if line.startswith(&#39;vendor_id&#39;):
</span></span><span class="line"><span class="cl">        vendor_info = line.split(&#39;:&#39;, 1)[1].strip()
</span></span><span class="line"><span class="cl">    if line.startswith(&#39;model name&#39;):
</span></span><span class="line"><span class="cl">        model_name = line.split(&#39;:&#39;, 1)[1].strip()
</span></span><span class="line"><span class="cl">    if vendor_info and model_name:
</span></span><span class="line"><span class="cl">        break
</span></span><span class="line"><span class="cl">#  if &#34;AMD&#34; in vendor_info and &#34;AMD EPYC&#34; not in model_name: &lt;---here
</span></span><span class="line"><span class="cl">#      self.error_exit(&#34;Unsupported CPU: %s&#34; % model_name)   &lt;---here
</span></span></code></pre></div><p>You just need to comment out the CPU check. After that, the Edge must be rebooted and our NSX will now work.</p>
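<p>If you want to know in advance whether a CPU would trip this check, the condition is easy to reproduce outside the Edge. The following is only a minimal sketch that mirrors the excerpt shown above, not the original VMware code: the function name <code>edge_cpu_check_passes</code> and the sample <code>/proc/cpuinfo</code> strings are assumptions made for illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">def edge_cpu_check_passes(cpuinfo):
    """Return True if the CPU described by /proc/cpuinfo text would pass the check."""
    vendor_info = ''
    model_name = ''
    for line in cpuinfo.splitlines():
        if line.startswith('vendor_id'):
            vendor_info = line.split(':', 1)[1].strip()
        if line.startswith('model name'):
            model_name = line.split(':', 1)[1].strip()
        if vendor_info and model_name:
            break
    # Same condition as in the excerpt: AMD CPUs must be EPYC, everything else passes.
    return not ('AMD' in vendor_info and 'AMD EPYC' not in model_name)


# Hypothetical sample data in /proc/cpuinfo format (illustration only).
ryzen = "vendor_id\t: AuthenticAMD\nmodel name\t: AMD Ryzen 9 9955HX 16-Core Processor\n"
epyc = "vendor_id\t: AuthenticAMD\nmodel name\t: AMD EPYC 7302 16-Core Processor\n"
intel = "vendor_id\t: GenuineIntel\nmodel name\t: 13th Gen Intel(R) Core(TM) i9-13900H\n"

for name, sample in (("ryzen", ryzen), ("epyc", epyc), ("intel", intel)):
    print(name, edge_cpu_check_passes(sample))
</code></pre></div>
<p>In this sketch only the Ryzen sample fails, which matches the behavior I saw on the MS-A2. On a real system you would pass in the contents of <code>/proc/cpuinfo</code> instead of the sample strings.</p>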
<p>The catch comes when upgrading or redeploying the Edge VMs: you only have about a minute after the Edge has been deployed to adjust the script.
That&rsquo;s why I&rsquo;ll be deploying my Edges via OVA in the future and running them on an Intel host. That&rsquo;s the easiest way for me.</p>
<h2 id="conclusion---is-the-ms-a2-the-perfect-home-server">Conclusion - Is the MS-A2 the perfect home server?</h2>
<p>I would give a clear yes and no. It depends greatly on what you intend to do with it. If you want to deploy VCF without workarounds and technical challenges, then I would definitely say go for the MS-01 and buy two of them. If you are using Proxmox or another hypervisor, then the MS-A2 is a no-brainer – buy it and be happy.
Both the MS-01 and MS-A2 are excellent compact systems for homelab use, but the MS-A2 clearly sets a new bar.
With twice the Cinebench score and significantly higher throughput on the internal network, this box is the clear winner for me.
However, I don&rsquo;t mind DIY and unsupported solutions. I can completely understand if someone prefers to buy a home server on which they can run VCF without any tinkering, but that&rsquo;s just not my thing.</p>
<h2 id="maybe-one-more-thinghow-much-does-it-cost">Maybe one more thing—how much does it cost?</h2>
<p>Since I haven&rsquo;t had time to update my Lab BOM yet, I&rsquo;ll briefly mention the costs here. I ordered the MS-A2 from Amazon because it was available immediately and I couldn&rsquo;t wait for the one I actually ordered from Minisforum to arrive.
I paid €1,395.88 for the MS-A2 with 2 TB NVMe and 128 GB RAM. On top of that, there&rsquo;s a custom rack mount for €92, but I&rsquo;m not counting that in the total cost. At €312, the RAM isn&rsquo;t exactly cheap and is still hard to come by in some places. But I think that&rsquo;s the price you pay for being an early adopter.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">And as always, I pay for the fun out of my own pocket, as I do not place any advertisements nor am I sponsored by any company.</div>
    </aside>
]]></content>
		</item>
		
		<item>
			<title>ReSTNSX with HCX</title>
			<link>https://sdn-warrior.org/posts/restnsx-with-hcx/</link>
			<pubDate>Tue, 20 May 2025 18:00:16 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/restnsx-with-hcx/</guid>
			<description><![CDATA[How can ReSTNSX help me with the migration with HCX]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>I have already written about my experiences with HCX in the past and as brilliant as the tool is for the migration of different workloads, the rework with the distributed firewall can be painful. Because what HCX can&rsquo;t do is move firewall rules or security objects. Fortunately, there is a solution for this: ReSTNSX.</p>
<p>But what exactly is ReSTNSX? ReSTNSX describes itself as follows: The ReSTNSX platform simplifies consumption of VMware&rsquo;s NSX API to increase visibility and reduce the risk of errors. Our mission is to enable administrators to easily configure and operate NSX with a straightforward and intuitive user experience.
We will see in this post whether it is really that simple.</p>
<p>A quick note about transparency: I was provided with a free demo license, but I am not receiving any money and this does not influence my opinion.
In this post, I’ll walk you through how ReSTNSX complements HCX, especially when migrating applications with strict security policies or complex network requirements.</p>
<h2 id="why-combine-hcx-with-restnsx">Why Combine HCX with ReSTNSX?</h2>
<p>While HCX handles the <strong>heavy lifting</strong> of vMotion, replication, and network extension, it doesn’t touch NSX security rules or segment tagging. That’s where ReSTNSX comes in.
ReSTNSX can access various systems via API, including Check Point and Palo Alto firewalls (which are not the focus of this article), as well as external resources such as Active Directory and SMTP servers. The architecture is relatively simple: there is a Cloud Control OVA, and that&rsquo;s it. It really is that simple.</p>
<figure><a href="01.png"><picture><source srcset="/restnsx-hcx/01_hu_ac2c28ccbe374737.png" type="image/png">
          <img
            src="/restnsx-hcx/01_hu_ac2c28ccbe374737.png"alt="ReSTNSX"width="1476"
            height="783"/>
        </picture></a><figcaption><p>ReSTNSX Architecture taken from ReSTNSX Website (click to enlarge)</p></figcaption></figure>
<p>Some key benefits:</p>
<ul>
<li>Automatically replicate security policies to the destination environment</li>
<li>Tag migrated VMs for policy-based firewall rules</li>
<li>Clean up or reassign segments and groups after cutover</li>
<li>Automate post-migration tasks (e.g., updating DFW rules, creating new groups)</li>
</ul>
<h2 id="getting-started">Getting started</h2>
<p>To get started, you first need to deploy the OVA and register the product. The process is pretty straightforward, so I won&rsquo;t go into detail here.
Once the CloudControl OVA has been successfully deployed and registered, you need to register the data sources. To do this, go to Administration - Data Sources.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">The default username is admin and the default password is default. Change the password right after the first login.</div>
    </aside>
<figure><a href="02.png"><picture><source srcset="/restnsx-hcx/02_hu_def51b1f8c08029d.png" type="image/png">
          <img
            src="/restnsx-hcx/02_hu_def51b1f8c08029d.png"alt="ReSTNSX"width="1383"
            height="626"/>
        </picture></a><figcaption><p>Datasources in ReSTNSX (click to enlarge)</p></figcaption></figure>
<p>As you can see here, I have set up two NSX environments as data sources. NSXB is a pure NSX lab, and VCF02 is my VCF stretched cluster, which I have already discussed in my blog.</p>
<figure><a href="03.png"><picture><source srcset="/restnsx-hcx/03_hu_107bd90accd3d9ff.png" type="image/png">
          <img
            src="/restnsx-hcx/03_hu_107bd90accd3d9ff.png"alt="ReSTNSX"width="1393"
            height="349"/>
        </picture></a><figcaption><p>Datasources in ReSTNSX (click to enlarge)</p></figcaption></figure>
<p>I then prepared the following test setup:</p>
<h2 id="restnsx--hcx-test-scenario">ReSTNSX &amp; HCX Test Scenario</h2>
<p>The migration scenario involves moving workloads from a source environment named <strong>NSXB</strong> to a destination environment <strong>VCF02</strong>.
The VCF02 NSX environment initially contains no custom firewall rules, tags, or groups—only the default settings provided by NSX.</p>
<h3 id="initial-environment-nsxb">Initial Environment (NSXB)</h3>
<p>In the source NSXB environment, we have the following setup:</p>
<ul>
<li><strong>Tag:</strong> <code>dfg_alpine</code></li>
<li><strong>Security Group:</strong> <code>dfg_all_Alpine</code></li>
<li><strong>Tagged VMs:</strong> <code>Alpine1</code>, <code>Alpine2</code>, <code>Alpine3</code>, and <code>Alpine4</code></li>
</ul>
<p>A custom firewall policy named <strong>ReSTNSX-Policy</strong> is configured as follows:</p>
<ol>
<li>
<p><strong>Allow ICMP to Alpine VMs</strong><br>
Allow all ICMP traffic from any source to all VMs in the group <code>dfg_all_Alpine</code>.<br>
<strong>Applied to:</strong> All distributed firewall (DFW) VMs</p>
</li>
<li>
<p><strong>Allow HTTP and HTTPS to WAN</strong><br>
Allow HTTP and HTTPS traffic originating from VMs in the <code>dfg_all_Alpine</code> group to any external (non-RFC1918) destination (Internet).<br>
<strong>Applied to:</strong> <code>dfg_all_Alpine</code></p>
</li>
<li>
<p><strong>Allow SSH to Alpine VMs</strong><br>
Allow SSH traffic (TCP port 22) from any source to all VMs in the <code>dfg_all_Alpine</code> group.<br>
<strong>Applied to:</strong> <code>dfg_all_Alpine</code></p>
</li>
<li>
<p><strong>Allow iPerf3 Between Alpine VMs</strong><br>
Allow iPerf3 traffic (TCP port 5201) between all VMs within the <code>dfg_all_Alpine</code> group.<br>
<strong>Applied to:</strong> <code>dfg_all_Alpine</code></p>
</li>
<li>
<p><strong>Allow DNS from Alpine VMs</strong><br>
Allow DNS traffic (TCP and UDP port 53) from all VMs in the <code>dfg_all_Alpine</code> group to any destination.<br>
<strong>Applied to:</strong> <code>dfg_all_Alpine</code></p>
</li>
<li>
<p><strong>Drop All from Alpine VMs</strong><br>
Drop all other traffic originating from the VMs within the <code>dfg_all_Alpine</code> group to any destination.<br>
<strong>Applied to:</strong> All distributed firewall (DFW) VMs</p>
</li>
</ol>
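<p>The order of these rules matters, because the distributed firewall evaluates a policy from top to bottom and stops at the first match. The following is a minimal sketch of that behavior, not an NSX API call: the tuple layout, the <code>evaluate</code> function and the placeholder destination <code>WAN</code> are assumptions for illustration, the Applied To scope is ignored, and the fallback models the default rule, which is set to drop on my system.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">ANY = "any"
ALPINE = "dfg_all_Alpine"

# (rule name, source, destination, services or None for any, action)
RULES = [
    ("Allow ICMP to Alpine VMs", ANY, ALPINE, {"ICMP"}, "ALLOW"),
    ("Allow HTTP and HTTPS to WAN", ALPINE, "WAN", {"HTTP", "HTTPS"}, "ALLOW"),
    ("Allow SSH to Alpine VMs", ANY, ALPINE, {"SSH"}, "ALLOW"),
    ("Allow iPerf3 Between Alpine VMs", ALPINE, ALPINE, {"iPerf3"}, "ALLOW"),
    ("Allow DNS from Alpine VMs", ALPINE, ANY, {"DNS"}, "ALLOW"),
    ("Drop All from Alpine VMs", ALPINE, ANY, None, "DROP"),
]


def evaluate(src, dst, service):
    """Return (rule name, action) of the first rule that matches the flow."""
    for name, rule_src, rule_dst, services, action in RULES:
        src_ok = rule_src in (ANY, src)
        dst_ok = rule_dst in (ANY, dst)
        svc_ok = services is None or service in services
        if src_ok and dst_ok and svc_ok:
            return name, action
    # Assumption: the default Layer 3 rule drops everything else.
    return "default rule", "DROP"


print(evaluate(ALPINE, "WAN", "HTTPS"))
print(evaluate(ALPINE, "WAN", "SMTP"))
</code></pre></div>
<p>HTTPS from an Alpine VM to the Internet hits the second rule, while SMTP falls through to the final drop rule. This is why the drop rule has to stay at the bottom of the policy.</p>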
<figure><a href="04.png"><picture><source srcset="/restnsx-hcx/04_hu_95d6dd1a85e88b4a.png" type="image/png">
          <img
            src="/restnsx-hcx/04_hu_95d6dd1a85e88b4a.png"alt="ReSTNSX Policy"width="1425"
            height="407"/>
        </picture></a><figcaption><p>ReSTNSX Firewall policy (click to enlarge)</p></figcaption></figure>
<h3 id="migration-of-the-firewall-rules-and-vms">Migration of the firewall rules and VMs</h3>
<p>I am migrating the workloads (<code>Alpine3</code> and <code>Alpine4</code>) seamlessly to the VCF02 environment using VMware HCX.
Of course, the VMs stop communicating after the migration because there are no firewall rules in the target environment yet, except for the default Layer 3 rule, which is set to drop all on my system.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">It is important that I replicate the NSX Security TAGs with HCX when migrating.
Of course, this is only possible if the source system already has NSX and does not come from a legacy VMware environment.
But in my case, I save myself additional steps here. ReSTNSX can, of course, provide TAGs via the Policy Engine or Bulk Provisioning, but why make life unnecessarily difficult?</div>
    </aside>
<p>There are several ways to migrate my firewall ruleset to the target environment.
I could import a ruleset using bulk provisioning and a CSV file, but in my opinion, the charm of this solution lies in the Policy Engine.
Using the Policy Engine, which can be found under Tools in the menu, you can easily create a sync task that synchronizes a wide variety of objects on a cyclical basis.</p>
<figure><a href="06.png"><picture><source srcset="/restnsx-hcx/06_hu_7122dd76231dd2ec.png" type="image/png">
          <img
            src="/restnsx-hcx/06_hu_7122dd76231dd2ec.png"alt="ReSTNSX Policy"width="1387"
            height="970"/>
        </picture></a><figcaption><p>Policy Engine in ReSTNSX (click to enlarge)</p></figcaption></figure>
<p>The screenshot shows my finished policy.
It is relatively simple. I have selected to create a policy of type dFW Sections.
By default, synchronization runs every 30 minutes when the policy is enabled.
Under NSX Manager, I specify my source and target NSX Manager, and in the dFW Section field, I set the source policy.
In my example, it is the previously created ReSTNSX policy from my NSXB environment. My destination is the VCF02 environment.
Under Synchronization Method, I can specify various options for what exactly should be synchronized.
Direction specifies the source and destination systems; a bi-directional sync is also possible.
Further up in the screenshot, you can see the result of the preview sync.
As you can see, sections, rules, security groups, static members, services, and context profiles are synchronized—that&rsquo;s quite a lot.
But first, a quick word about the synchronization method, as this has an impact on the actual implementation.</p>
<p>There are 4 synchronization options: Effective Members, Membership Criteria, Membership Criteria (MAT Mode), and Effective Members + Membership Criteria.</p>
<ul>
<li>
<p>Effective Members: This mode ensures that the realized members from the source group are written to the new group, which means that the new group receives a static IP assignment for its members.
In this case, the membership criterion is an IP, even if it was previously a dynamic group.
If the source VM does not have a realized IP, it is not part of the new group.</p>
</li>
<li>
<p>Membership Criteria: In this mode, only the criteria for how the groups receive their members are synchronized.
In the case of an IP-only group, the defined IPs are simply stored as criteria.
This mode allows us to obtain our dynamic groups, but if TAGs are used as criteria, they must be created in the target environment (or synchronized with HCX), otherwise the groups will remain empty.</p>
</li>
<li>
<p>Membership Criteria (MAT Mode): Only for V to T migration. Since I don&rsquo;t have NSX-V in use, I couldn&rsquo;t test anything here.</p>
</li>
<li>
<p>Effective Members + Membership Criteria: This is a combination of both methods. I have my dynamic TAG definition and my effective members. This can be used if you cannot synchronize TAGs and want to make sure that communication works instantly in the new environment. However, this requires post-processing steps and is not necessary in my setup.</p>
</li>
</ul>
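<p>The difference between Effective Members and Membership Criteria is easier to see with a small model. This is a hedged sketch in plain Python and has nothing to do with the actual ReSTNSX implementation: the inventory, the IP addresses and all function names are made up for illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"># Hypothetical source inventory: VM name mapped to its tags and realized IP.
SOURCE_VMS = {
    "Alpine1": {"tags": {"dfg_alpine"}, "ip": "192.0.2.11"},
    "Alpine2": {"tags": {"dfg_alpine"}, "ip": "192.0.2.12"},
    "Alpine3": {"tags": {"dfg_alpine"}, "ip": None},  # no realized IP
}
GROUP_CRITERIA = {"tag": "dfg_alpine"}  # dynamic group definition in the source


def sync_effective_members(vms, criteria):
    """Effective Members: resolve the group in the source, copy the IPs as static members."""
    ips = [vm["ip"] for vm in vms.values()
           if criteria["tag"] in vm["tags"] and vm["ip"]]
    return {"static_ips": sorted(ips)}


def sync_membership_criteria(criteria):
    """Membership Criteria: copy only the definition, the target resolves it itself."""
    return {"criteria": dict(criteria)}


def resolve(group, target_vms):
    """Members the target environment would see for a synced group."""
    if "static_ips" in group:
        return group["static_ips"]
    tag = group["criteria"]["tag"]
    return sorted(name for name, vm in target_vms.items() if tag in vm["tags"])


static_group = sync_effective_members(SOURCE_VMS, GROUP_CRITERIA)
dynamic_group = sync_membership_criteria(GROUP_CRITERIA)

untagged_target = {"Alpine3": {"tags": set(), "ip": "192.0.2.13"}}
tagged_target = {"Alpine3": {"tags": {"dfg_alpine"}, "ip": "192.0.2.13"}}

print(resolve(static_group, tagged_target))
print(resolve(dynamic_group, untagged_target))
print(resolve(dynamic_group, tagged_target))
</code></pre></div>
<p>The static group only contains the two IPs that were realized in the source, so <code>Alpine3</code> is missing. The dynamic group stays empty as long as the tag does not exist in the target and picks up <code>Alpine3</code> as soon as the tag arrives with the VM.</p>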
<p>By default, ReSTNSX creates firewall sections, rules, and object names in the target environment that are identical to those in the source environment. This behavior can be changed when creating the policy.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">It is not possible to change the behavior after the first synchronization. The policy must be deleted and recreated. When deleting the policy, you can also delete all objects created by the policy.</div>
    </aside>
<figure><a href="07.png"><picture><source srcset="/restnsx-hcx/07_hu_94a8317950e81dab.png" type="image/png">
          <img
            src="/restnsx-hcx/07_hu_94a8317950e81dab.png"alt="NSX Firewall Policy"width="1701"
            height="922"/>
        </picture></a><figcaption><p>Firewall Policy in the target NSX (click to enlarge)</p></figcaption></figure>
<p>A policy does not have to be activated; you can also start it manually with a forced sync and use this policy as a one-time synchronization. For long migration scenarios, performing the synchronization cyclically is invaluable, especially if there are rule adjustments during the migration phase.
After successful synchronization, the complete policy is available 1:1 in my VCF02 environment, and the migrated VMs <code>Alpine3</code> and <code>Alpine4</code> are automatically members of the <code>dfg_all_Alpine</code> group, because they retained their NSX security tags during the HCX migration and the groups created by ReSTNSX use the same definitions as in the source environment.</p>
<figure><a href="08.png"><picture><source srcset="/restnsx-hcx/08_hu_4040506d127059ad.png" type="image/png">
          <img
            src="/restnsx-hcx/08_hu_4040506d127059ad.png"alt="NSX Firewall groups"width="1473"
            height="1240"/>
        </picture></a><figcaption><p>Firewall groups in the target NSX (click to enlarge)</p></figcaption></figure>
<h2 id="additional-features-of-restnsx">Additional features of ReSTNSX</h2>
<p>ReSTNSX is not just a simple migration tool, although that was certainly the first use that came to mind. It can do much more than that. ReSTNSX has a very good, customizable dashboard that provides me with long-term information about my NSX environment.</p>
<figure><a href="05.png"><picture><source srcset="/restnsx-hcx/05_hu_f166d5db24afbd14.png" type="image/png">
          <img
            src="/restnsx-hcx/05_hu_f166d5db24afbd14.png"alt="Restnsx Dashboard"width="1692"
            height="880"/>
        </picture></a><figcaption><p>ReSTNSX Dashboard (click to enlarge)</p></figcaption></figure>
<p>Of course, the dashboard in my lab doesn&rsquo;t show much, because I don&rsquo;t have Aria Operations for Networks running, and I also shut down my labs after successful testing so as not to drive up my electricity bill even more.
ReSTNSX also features a comprehensive reporting system and enables NSX documentation and auditing. Bulk Provisioning is a powerful tool for manipulating, creating, and detaching security tags, groups, firewall rules, and other items. You can even clone an entire NSX environment.</p>
<h2 id="conclusion">Conclusion</h2>
<p>ReSTNSX is not just a simple migration tool; it also accelerates Day 2 operations in NSX and VCF. With ReSTNSX and HCX combined, it has never been easier to migrate existing microsegmented workloads from one workload domain to a new workload domain or VCF instance. ReSTNSX uses APIs from vCenter and NSX and offers a centralized CLI. It also offers true multi-cloud management (VMware Cloud on AWS (VMCoAWS), Google (GCVE), Azure, Oracle, and IBM). I&rsquo;m sure I&rsquo;ve only scratched the surface, as some things are not so easy to test, especially when it comes to multi-cloud, but I hope I&rsquo;ve been able to provide some insight into a cool tool that can make everyday life and migration tasks easier.</p>
]]></content>
		</item>
		
		<item>
			<title>NSX Expiring Transport Node Certificates</title>
			<link>https://sdn-warrior.org/posts/nsx-expiring-tn-certificates/</link>
			<pubDate>Tue, 06 May 2025 18:00:00 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/nsx-expiring-tn-certificates/</guid>
			<description><![CDATA[On versions NSX 4.1.x and 4.2.0, Edge and Host Transport Nodes are instantiated using a certificate with validity period of 825 days.
NSX-T 3.x and NSX 4.2.1 and higher create Transport Nodes using a certificate with validity period of 10 years.
The Transport Node certificate used at create time is not replaced on upgrade.
Any Edge that may have been deployed on these versions or any Hosts prepared or re-prepared on these versions will have this shorter validity period certificate.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>In NSX versions <strong>4.1.x</strong> and <strong>4.2.0</strong>, <strong>edge transport nodes</strong> and <strong>host transport nodes</strong> are instantiated using certificates with a <strong>validity period of only 825 days</strong>.
This is obviously not desirable behavior and has been fixed in newer versions of NSX. Interestingly, I haven&rsquo;t seen anything about this in the changelog.
In NSX versions 3.X and 4.2.1 and higher, these certificates are valid for 10 years. But what exactly does that mean?</p>
<h2 id="the-problem">The Problem</h2>
<p>First of all, don&rsquo;t panic. The affected certificates are not visible in the GUI, but NSX Manager will issue a warning and display an error 30 days in advance. If you don&rsquo;t see these messages in your NSX, you have more than 30 days to respond.
Furthermore, it only affects transport nodes deployed with NSX 4.1.x or 4.2.0. If you have upgraded from NSX 3.X to one of the affected versions or have just deployed VCF version 5.2.x, you are safe.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">The certificate issued at the time of Transport Node creation <strong>is not replaced automatically during an upgrade</strong>.</div>
    </aside>
<h2 id="why-it-matters">Why It Matters</h2>
<p>As the 825-day period approaches its end, you may encounter <strong>certificate expiration issues</strong>, potentially affecting NSX component communication and overall platform stability.

    <aside class="admonition Warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Warning</b>
        </div>
        <div class="admonition-content">Once the TransportNode certificate expires, there is a grace period of only 24 hours after which all impacted Edges and Hosts will be disconnected from NSX.</div>
    </aside></p>
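<p>To get a feeling for the timeline, you can do the arithmetic yourself. The sketch below assumes that the certificate validity starts on the day the transport node was deployed or prepared; the deployment date is a made-up example, and the authoritative expiry date is always the one stored in the certificate itself.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">from datetime import date, timedelta

VALIDITY = timedelta(days=825)     # certificate lifetime on NSX 4.1.x and 4.2.0
WARNING_LEAD = timedelta(days=30)  # NSX Manager starts warning 30 days in advance
GRACE = timedelta(days=1)          # 24-hour grace period after expiry


def certificate_timeline(deployed):
    """Return the key dates for a transport node deployed on the given date."""
    expires = deployed + VALIDITY
    return {
        "warning_from": expires - WARNING_LEAD,
        "expires": expires,
        "disconnected_after": expires + GRACE,
    }


# Hypothetical deployment date, for illustration only.
for label, day in certificate_timeline(date(2023, 6, 1)).items():
    print(label, day.isoformat())
</code></pre></div>
<p>For a node deployed on 1 June 2023, the certificate would expire on 3 September 2025, with the first warning due on 4 August 2025.</p>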
<p>If you are affected, there are two scenarios.</p>
<ul>
<li>Scenario A) The certificates have not expired yet. Either you have received a warning in NSX Manager and therefore found my blog, or you are not yet aware of the problem because the certificates will expire sometime between 31 and 825 days from now.</li>
<li>Scenario B) The certificates have already expired and transport nodes are disconnected from NSX.</li>
</ul>
<p>There is a solution for both scenarios. But let&rsquo;s start with the simple scenario A first.</p>
<h2 id="find-the-problem">Find the problem</h2>
<p>If you don&rsquo;t know which NSX version you installed and want to check whether you are affected by this issue, Broadcom has a relatively simple solution. There is a script called CARR (Certificate Analyzer, Results and Recovery) that can be easily run on NSX Manager (NSX 4.1.x to 4.2) or on an external client (other NSX versions).
Download <a href="https://knowledge.broadcom.com/external/article/369034">Certificate Analyzer, Results and Recovery (CARR)</a> from the official KB article.</p>
<p>In all cases, the script requires the following ports to be open between the client machine and the 3 NSX Managers:</p>
<ul>
<li>ssh port 22</li>
<li>https port 443</li>
<li>corfu port 9000</li>
</ul>
<p>If running on the NSX Manager directly, ports 443 and 9000 will already be open between the 3 Managers.</p>
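<p>If you run CARR from an external client, it is worth checking the three ports before you start. The following is a minimal sketch using only the Python standard library; it merely tests whether a TCP connection can be established, and the helper names as well as the hostnames in the usage note are assumptions, not part of CARR.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import socket

# Ports required by CARR according to the list above.
REQUIRED_PORTS = {"ssh": 22, "https": 443, "corfu": 9000}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_managers(managers):
    """Map every NSX Manager to the reachability of each required port."""
    return {
        manager: {name: port_open(manager, port)
                  for name, port in REQUIRED_PORTS.items()}
        for manager in managers
    }
</code></pre></div>
<p>A call such as <code>check_managers(["nsx01.lab.home", "nsx02.lab.home", "nsx03.lab.home"])</code> (hypothetical hostnames) returns one dictionary per manager, so a closed port shows up immediately as <code>False</code>.</p>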
<p>If you want to run the script on the NSX Manager, it must be executed directly from the /root directory.
To do this, copy the archive to your NSX Manager or Global NSX Manager using sftp, extract it with tar, and run the script.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">tar -xvf carr-1.15.tar.gz
</span></span><span class="line"><span class="cl"><span class="nb">cd</span> carr-1.15
</span></span><span class="line"><span class="cl">./start.sh -d
</span></span></code></pre></div>
    <aside class="admonition Warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Warning</b>
        </div>
        <div class="admonition-content">The CARR script supports multiple parameters. Always perform an NSX backup first and, above all, run the script in dryrun mode. Dryrun mode executes the script in read-only mode.</div>
    </aside>
<p>Script options:</p>
<ul>
<li>
<p>-o = this flag is used to force online mode</p>
</li>
<li>
<p>-t = specify lead time for expiring certificates, between 31 and 825 days.</p>
</li>
<li>
<p>-d = dryrun</p>
</li>
</ul>
<p>I ran the script with the default settings (a lead time of 825 days for expiring certificates) and the output was as follows:</p>
<h3 id="carr-script-validation-report">CARR Script Validation Report</h3>
<table>
  <thead>
      <tr>
          <th>Certificate Checks</th>
          <th>Validation Results</th>
          <th>Probable Fix</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>API</td>
          <td>WARNING: Certificate is expiring in 680 days<br>SUCCESS: 10.28.0.3</td>
          <td>Certificate is CA signed. Customer should import the new CA signed certificate.</td>
      </tr>
      <tr>
          <td>VIP</td>
          <td>WARNING: Certificate is expiring in 680 days<br>SUCCESS: 10.28.0.3</td>
          <td>Certificate is CA signed. Customer should import the new CA signed certificate.</td>
      </tr>
      <tr>
          <td>STALE-CERTIFICATES</td>
          <td>SUCCESS: No stale certificates found.</td>
          <td></td>
      </tr>
      <tr>
          <td>APH_AR</td>
          <td>SUCCESS: 10.28.0.3</td>
          <td></td>
      </tr>
      <tr>
          <td>COMPUTE_MANAGER</td>
          <td>VC(CM): vcf02-vcsa.lab.home: SUCCESS: No issue with certificates found.</td>
          <td></td>
      </tr>
      <tr>
          <td>LOCAL-MANAGER-PI</td>
          <td>The NSX-manager is not federated. Skipping Local Manager cert validations</td>
          <td></td>
      </tr>
      <tr>
          <td>SITE-TO-SITE</td>
          <td>The NSX-manager is not federated. Skipping APH_AR cert validations</td>
          <td></td>
      </tr>
      <tr>
          <td>HOST</td>
          <td>No Host certificate is expiring or has expired</td>
          <td></td>
      </tr>
      <tr>
          <td>EDGE</td>
          <td>No EDGE node certificate is expiring or has expired</td>
          <td></td>
      </tr>
      <tr>
          <td>CCP</td>
          <td>SUCCESS: 10.28.0.3</td>
          <td></td>
      </tr>
      <tr>
          <td>APH_TN</td>
          <td>SUCCESS: 10.28.0.3</td>
          <td></td>
      </tr>
      <tr>
          <td>CBM_CLUSTER_MANAGER</td>
          <td>SUCCESS: 10.28.0.3</td>
          <td></td>
      </tr>
      <tr>
          <td>CBM_CORFU</td>
          <td>SUCCESS: 10.28.0.3</td>
          <td></td>
      </tr>
  </tbody>
</table>
<p><em>All validations done.</em></p>
<p>As you can see from the output for HOST and EDGE, my installation is not affected. However, it shows that my API and VIP certificates will expire in 680 days. CARR is also smart enough to recognize that they are signed by a CA and were not issued by NSX itself.</p>

    <aside class="admonition Warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Warning</b>
        </div>
        <div class="admonition-content">If I hadn&rsquo;t run the script in dry run mode, the script would have replaced my API and VIP certificates from NSX Manager with its own self-signed certificates. However, this is not desirable in this case. Caution is advised here, as it is easy to accidentally replace certificates that you do not want to replace.</div>
    </aside>
<p>To trigger the transport node (TN) certificate replacement, the environment details must be entered in the existing file <code>validation_config.yaml</code>, which is located in the same folder as <code>start.sh</code>.
Here is my example YAML; I have disabled validation of all certificates except HOST and EDGE.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># user interface to provide the validation config</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c"># user can specify if any certificate validation needs to be skipped.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c"># by default all certificate types will be validated.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c"># For Hosts , the vCenter cluster names for host must be specified. Script will validate hosts in those clusters only.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c"># For Edge node, Edge cluster name must be specified. Script will validate edge nodes in those clusters only.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">HOST</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">True</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">clusters</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">vcenter_name</span><span class="p">:</span><span class="w"> </span><span class="l">vcf02-vcsa.lab.home</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">vcenter_cluster_name</span><span class="p">:</span><span class="w"> </span><span class="l">sfo-m01-cluster-001</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">EDGE</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">True</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">clusters</span><span class="p">:</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">cl01</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">API</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">VIP</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">CBM-FILE-PERMISSIONS</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">CBM_CORFU</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">CBM_CLUSTER_MANAGER</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">CORFU_SERVER</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">CORFU_CLIENTS</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">LOCAL-MANAGER-PI</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">GLOBAL-MANAGER-PI</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">STALE-CERTIFICATES</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">APH_TN</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">APH_AR</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">CCP</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">SITE-TO-SITE</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">COMPUTE_MANAGER</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w"> </span><span class="kc">False</span><span class="w">
</span></span></span></code></pre></div><p>When I now run the script without dry run, the output shows that every check except HOST and EDGE is skipped. This effectively prevents certificates that you do not want to replace from being replaced.</p>
<h3 id="carr-script-validation-report-1">CARR Script Validation Report</h3>
<table>
  <thead>
      <tr>
          <th>Certificate Checks</th>
          <th>Validation Results</th>
          <th>Probable Fix</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>API</td>
          <td>Validation for the &lsquo;API&rsquo; is disabled in the input config file</td>
          <td></td>
      </tr>
      <tr>
          <td>VIP</td>
          <td>Validation for the &lsquo;VIP&rsquo; is disabled in the input config file</td>
          <td></td>
      </tr>
      <tr>
          <td>STALE-CERTIFICATES</td>
          <td>Validation for the &lsquo;STALE-CERTIFICATES&rsquo; is disabled in the input config file</td>
          <td></td>
      </tr>
      <tr>
          <td>APH_AR</td>
          <td>Validation for the &lsquo;APH_AR&rsquo; is disabled in the input config file</td>
          <td></td>
      </tr>
      <tr>
          <td>COMPUTE_MANAGER</td>
          <td>Validation for the &lsquo;COMPUTE_MANAGER&rsquo; is disabled in the input config file</td>
          <td></td>
      </tr>
      <tr>
          <td>LOCAL-MANAGER-PI</td>
          <td>Validation for the &lsquo;LOCAL-MANAGER-PI&rsquo; is disabled in the input config file</td>
          <td></td>
      </tr>
      <tr>
          <td>SITE-TO-SITE</td>
          <td>Validation for the &lsquo;SITE-TO-SITE&rsquo; is disabled in the input config file</td>
          <td></td>
      </tr>
      <tr>
          <td>HOST</td>
          <td>No Host certificate is expiring or has expired</td>
          <td></td>
      </tr>
      <tr>
          <td>EDGE</td>
          <td>No EDGE node certificate is expiring or has expired</td>
          <td></td>
      </tr>
      <tr>
          <td>CCP</td>
          <td>Validation for the &lsquo;CCP&rsquo; is disabled in the input config file</td>
          <td></td>
      </tr>
      <tr>
          <td>APH_TN</td>
          <td>Validation for the &lsquo;APH_TN&rsquo; is disabled in the input config file</td>
          <td></td>
      </tr>
      <tr>
          <td>CBM_CLUSTER_MANAGER</td>
          <td>Validation for the &lsquo;CBM_CLUSTER_MANAGER&rsquo; is disabled in the input config file</td>
          <td></td>
      </tr>
      <tr>
          <td>CBM_CORFU</td>
          <td>Validation for the &lsquo;CBM_CORFU&rsquo; is disabled in the input config file</td>
          <td></td>
      </tr>
  </tbody>
</table>
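<p>If you want to double-check which checks a config like the one above actually leaves enabled before starting a real run, a small helper can list them. This is only a minimal sketch in Python and not part of CARR: the function name <code>enabled_checks</code> and the shortened sample config are my own assumptions, and the parser only understands the simple layout shown above (an unindented check name followed by an indented <code>validate:</code> flag).</p>

```python
import re


def enabled_checks(config_text: str) -> list[str]:
    """Return the top-level check names whose 'validate' flag is True."""
    enabled = []
    current = None
    for raw in config_text.splitlines():
        # Drop comments and trailing whitespace, skip empty lines.
        line = raw.split("#", 1)[0].rstrip()
        if not line.strip():
            continue
        # An unindented "NAME:" line starts a new check section.
        top = re.fullmatch(r"([A-Za-z0-9_-]+):", line)
        if top:
            current = top.group(1)
            continue
        # An indented "validate: True/False" line belongs to the current section.
        flag = re.fullmatch(r"\s+validate:\s*(\w+)", line)
        if flag and current and flag.group(1).lower() == "true":
            enabled.append(current)
    return enabled


# Shortened, hypothetical sample in the same layout as the config above.
SAMPLE = """\
HOST:
  validate: True
  clusters:
    - vcenter_name: vcf02-vcsa.lab.home
      vcenter_cluster_name: sfo-m01-cluster-001
EDGE:
  validate: True
  clusters:
    - name: cl01
API:
  validate: False
VIP:
  validate: False
"""

print(enabled_checks(SAMPLE))
```

<p>With the shortened sample this prints <code>['HOST', 'EDGE']</code>, which matches the two checks that are not skipped in the report.</p>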
<h2 id="what-to-do-if-the-transport-nodes-are-already-disconnected">What to do if the transport nodes are already disconnected?</h2>
<p>There is a solution for this as well. Unfortunately, CARR can no longer be used for transport nodes that are already disconnected. Instead, you manually generate a new certificate on the transport node and then push it from the transport node to the NSX Manager.</p>

    <aside class="admonition Warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Warning</b>
        </div>
        <div class="admonition-content">In general, caution is advised when undertaking this task. Personally, I would always open a ticket with Broadcom, since mistakes here can cause further damage. If an Edge Node is disconnected, you can also redeploy it via the API, which may even be faster than troubleshooting with CARR or manually.
You can find out how to do this in my blog here: <a href="https://sdn-warrior.org/posts/nsx-edge-redeploy/">SDN-warrior - NSX Edge Redeploy Guide</a></div>
    </aside>
<ul>
<li>
<p>SSH to the transport node as the root user</p>
</li>
<li>
<p>Empty the transport node certificate and private key files</p>
</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">cat /dev/null &gt; /etc/vmware/nsx/host-cert.pem
</span></span><span class="line"><span class="cl">cat /dev/null &gt; /etc/vmware/nsx/host-privkey.pem
</span></span></code></pre></div><ul>
<li>Generate a new self-signed TN certificate and key.</li>
</ul>
<h4 id="for-nsx-41x-versions-prior-to-4125">For NSX 4.1.x versions prior to 4.1.2.5:</h4>
<ul>
<li>Create a temporary openssl config file from the existing openssl config</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">cat /etc/vmware/nsx/openssl-proxy.cnf &gt; /tmp/tmp-openssl-proxy.cnf
</span></span></code></pre></div><ul>
<li>Extract the UUID and append it to the temporary openssl config</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">echo &#34;UID = $(grep -o &#39;&lt;uuid&gt;[^&lt;]*&#39; /etc/vmware/nsx/host-cfg.xml | sed &#39;s/&lt;uuid&gt;//&#39;)&#34; &gt;&gt; /tmp/tmp-openssl-proxy.cnf
</span></span></code></pre></div><ul>
<li>Add extension in the temporary openssl config</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">echo -e &#34;[ req_ext ]\nbasicConstraints     = CA:FALSE\nextendedKeyUsage     = clientAuth\nsubjectKeyIdentifier = hash\nauthorityKeyIdentifier = keyid,issuer&#34; &gt;&gt; /tmp/tmp-openssl-proxy.cnf
</span></span></code></pre></div><ul>
<li>Replace the certificate; the -days parameter below sets a validity period of 3650 days (10 years)</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">openssl req -new -newkey rsa:2048 -days 3650 -nodes -x509 -keyout /etc/vmware/nsx/host-privkey.pem -out /etc/vmware/nsx/host-cert.pem -config /tmp/tmp-openssl-proxy.cnf -extensions req_ext
</span></span></code></pre></div><h4 id="for-nsx-4125-and-higher">For NSX 4.1.2.5 and higher</h4>
<ul>
<li>Restarting nsx-proxy creates the new cert-key pair:</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">/etc/init.d/nsx-proxy restart
</span></span></code></pre></div><ul>
<li>Identify the NSX Manager thumbprint: SSH to the NSX Manager as the admin user and run</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">get certificate api thumbprint
</span></span></code></pre></div><p>To push the new cert-key pair to the Manager, run the following as the root user on the Host or Edge (any NSX Manager name or IP can be used):</p>
<h4 id="edge">Edge</h4>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">su admin -c push host-certificate &lt;Manager hostname-or-IP&gt; username admin thumbprint &lt;thumbprint from step 4&gt;
</span></span><span class="line"><span class="cl">Password for API user: &lt;enter admin password&gt;
</span></span></code></pre></div><h4 id="host">Host</h4>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">nsxcli -c push host-certificate &lt;Manager hostname-or-IP&gt; username admin thumbprint &lt;thumbprint from step 4&gt;
</span></span><span class="line"><span class="cl">Password for API user: &lt;enter admin password&gt;
</span></span></code></pre></div><p>The official Broadcom KB article can be found here: <a href="https://knowledge.broadcom.com/external/article/345802/alarm-for-transport-node-certificate-is.html">Broadcom: Alarm For Transport Node Certificate is About to Expire.</a></p>
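<p>After pushing the new certificate, it can be worth checking how long it is actually valid. The following is a minimal sketch in Python, under the assumption that you read the expiry date with <code>openssl x509 -noout -enddate -in /etc/vmware/nsx/host-cert.pem</code>, which prints a line in the form <code>notAfter=... GMT</code>. The helper name <code>days_until_expiry</code> and the example timestamp are hypothetical and not part of any NSX tooling.</p>

```python
import ssl
import time
from typing import Optional


def days_until_expiry(enddate_line: str, now: Optional[float] = None) -> float:
    """Return the days left until the notAfter timestamp printed by openssl."""
    # Accept both 'notAfter=Apr  6 21:00:00 2036 GMT' and the bare timestamp.
    stamp = enddate_line.strip().split("=", 1)[-1]
    # cert_time_to_seconds expects the GMT format that openssl prints.
    expires = ssl.cert_time_to_seconds(stamp)
    if now is None:
        now = time.time()
    return (expires - now) / 86400


# Hypothetical example line; replace it with the output from your transport node.
example = "notAfter=Apr  6 21:00:00 2036 GMT"
print(f"{days_until_expiry(example):.0f} days left")
```

<p>A negative result means the certificate has already expired. The same check could be run regularly as part of your monitoring, so that the expiry date does not come as a surprise.</p>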
<h2 id="conclusion">Conclusion</h2>
<p>At first glance, this problem sounds quite drastic, but if you have monitoring in place and regularly check the status of your NSX installation, the outage can be avoided relatively easily. The most important thing to know, however, is that upgrading your NSX version does <strong>not</strong> extend existing certificate lifetimes!</p>
]]></content>
		</item>
		
		<item>
			<title>VMUG Connect St. Louis 2025</title>
			<link>https://sdn-warrior.org/posts/vmug-connect-stl-2025/</link>
			<pubDate>Sat, 26 Apr 2025 16:00:10 -0500</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vmug-connect-stl-2025/</guid>
			<description><![CDATA[Brief report on my VMUG Connect in St. Louis]]></description>
			<content type="html"><![CDATA[<h2 id="vmug-st-louis--between-tech-talks-and-travel-vibes">VMUG St. Louis – Between Tech Talks and Travel Vibes</h2>
<p>This week, I had the chance to combine two things I really enjoy – community and travel. Together with two colleagues from <a href="https://www.evoila.de">evoila</a>, I made my way to the VMUG Connect in St. Louis. While the trip started as a tech-focused event, it turned into much more than just work. Exploring a new city, meeting passionate people from the community, and exchanging ideas over a coffee (or two) made this a truly special experience. And when I write coffee, I usually mean beer. I guess I&rsquo;m German after all.</p>
<figure><picture><source srcset="/vmug-stl-25/vmug01_hu_86e1af604fe93993.jpg" type="image/jpeg">
          <img
            src="/vmug-stl-25/vmug01_hu_86e1af604fe93993.jpg"alt="Coffee at Half &amp; Half"width="1367"
            height="1387"/>
        </picture><figcaption><p>Coffee at Half &amp; Half</p></figcaption></figure>
<p>The agenda was packed: from VCF sessions to cloud-native talks and everything in between. But what stood out most to me wasn’t just the content – it was the people. Seeing familiar faces from previous events and meeting new folks who share the same enthusiasm for VMware technologies is what keeps me coming back to VMUG events. In the end, it&rsquo;s the community that makes the events special.</p>
<p>Of course, I couldn’t resist bringing my homelab brain along – some of the conversations sparked new ideas that I’ll definitely try out in my setup back home. Is that the VCF 9 Beta I hear clattering?
Also, a big highlight: catching up with <a href="https://williamlam.com/">William Lam</a> in person – always a pleasure and a source of inspiration.</p>
<p>But before I get to the work part of the trip, I want to say a few words about St. Louis itself.</p>
<h2 id="stl">STL</h2>
<p>Since this was my first trip to the USA, all I can say is that I heard a lot of nonsense beforehand and, in my opinion, St. Louis is nowhere near as bad as its reputation.
On the contrary, I got lots of tips from local Reddit users and was welcomed very warmly by the people there. Yes, the US is different from Europe, but it would be strange if it weren&rsquo;t, considering that there are 7,388 km (no idea how many feet that is) as the crow flies between the bar where I&rsquo;m typing this and my home. And yes, I&rsquo;m not a budding unsuccessful writer, which is why I&rsquo;m in a bar and not at Starbucks. But I&rsquo;m straying from the actual topic.</p>
<h2 id="beyond-tech--my-personal-highlights-in-st-louis">Beyond Tech – My Personal Highlights in St. Louis</h2>
<p>While the VMUG itself was a clear highlight, I also took some time to explore St. Louis – and the city definitely delivered. One of the first stops was the legendary <strong>Anheuser-Busch Brewery</strong>, where my colleague <a href="https://www.linkedin.com/in/andreas-mensing-a224b230/">Andreas Mensing</a> and I joined a guided tour. Besides learning more about the brewing process, the history of this iconic brand, and meeting the famous Clydesdales (Horses), we also enjoyed a well-earned <strong>free beer</strong> at the end - or maybe two. Who&rsquo;s going to count here—not me.
<em>Fun fact:</em> The brewery has been around since <strong>1852</strong> and still produces some of the most well-known beers in the U.S., including Budweiser. I have to say that I like Beer Hug (IPA) much better than Budweiser.</p>
<figure><picture><source srcset="/vmug-stl-25/vmug02_hu_9ca8392859297fe7.jpg" type="image/jpeg">
          <img
            src="/vmug-stl-25/vmug02_hu_9ca8392859297fe7.jpg"alt="Free beer"width="2142"
            height="1428"/>
        </picture><figcaption><p>free beer</p></figcaption></figure>
<p>Another must-see: <strong>the Gateway Arch</strong>. Standing at <strong>192 meters</strong>, it’s the tallest monument in the U.S., and the view from the top is absolutely worth the ride up. Right underneath it, there’s a <strong>free museum</strong> that covers the history of westward expansion – and speaking of free: most museums in the U.S. don’t charge admission, which is something I really appreciated as a visitor.</p>
<figure><picture><source srcset="/vmug-stl-25/vmug03_hu_b64f4545ba5e3d4c.jpg" type="image/jpeg">
          <img
            src="/vmug-stl-25/vmug03_hu_b64f4545ba5e3d4c.jpg"alt="Arch"width="2013"
            height="1509"/>
        </picture><figcaption><p>View from the top of the arch</p></figcaption></figure>
<p>One of the most unexpected gems was the <strong>City Museum</strong>. It’s hard to describe – part art installation, part adventure playground, part industrial fantasy: Mad Max meets IKEA Småland. If the weather is good, I absolutely recommend a visit. Unfortunately, the rooftop area was closed during my stay, but the outdoor section was open and a lot of fun.<br>
It’s a place where you can easily spend hours – whether you&rsquo;re a kid or just a curious tech guy climbing through repurposed factory parts. However, you could get stuck in one of the tubes. On the other hand, you are never too old to slide. Regardless of all the things there are to climb or slide on, you can still get a good impression of the city&rsquo;s history here.</p>
<figure><picture><source srcset="/vmug-stl-25/vmug04_hu_b2cbd9f4d21989af.jpg" type="image/jpeg">
          <img
            src="/vmug-stl-25/vmug04_hu_b2cbd9f4d21989af.jpg"alt="City Museum"width="1809"
            height="1357"/>
        </picture><figcaption><p>City Museum</p></figcaption></figure>
<p>I can also recommend the zoo to anyone who likes zoos. Admission is free and the zoo is huge. Personally, I&rsquo;m a bit critical of zoos, but the grounds are in a beautiful location in Forest Park, where there are also two other museums.</p>
<p>Last but not least: <strong>Baseball!</strong><br>
I got the chance to attend a game between the <strong>Milwaukee Brewers and the St. Louis Cardinals</strong>. The atmosphere in <strong>Busch Stadium</strong> was electric, and the Cardinals secured a <strong>3–2 win at home</strong>. Great weather, beer and a real American sports vibe – the perfect way to wrap up the trip.</p>
<figure><picture><source srcset="/vmug-stl-25/vmug05_hu_35c7061efb58653b.jpg" type="image/jpeg">
          <img
            src="/vmug-stl-25/vmug05_hu_35c7061efb58653b.jpg"alt="Baseball"width="1072"
            height="1214"/>
        </picture><figcaption><p>Baseball with the Crew</p></figcaption></figure>
<p>I could write a lot more, but this is supposed to be about VMUG Connect, not just my vacation fun.</p>
<h2 id="vmug-st-louis--day-1-preconnect--expert-exchange">VMUG St. Louis – Day 1: PreConnect &amp; Expert Exchange</h2>
<p>Day 1 kicked off with the <strong>PreConnect sessions</strong>, a more intimate format focused on direct exchange between customers, Broadcom experts, and fellow vExperts. The theme was simple but powerful: <em>Meet the Experts</em>.</p>
<p>There were two sessions – one centered around <strong>Private AI</strong> and the second focused on <strong>VMware Cloud Foundation (VCF)</strong>. Both were structured more like open Q&amp;A panels than formal presentations, with lots of room for discussion and live feedback.</p>
<p>Personally, I really enjoyed the VCF session. It wasn’t just informative – it was also a great opportunity to exchange ideas with others working on similar challenges and projects.
Afterwards, there was a welcome reception at the Anheuser-Busch brewery. Since I was already familiar with the tour, I decided to skip it and went straight to &ldquo;coffee&rdquo;, where I had some good conversations that weren&rsquo;t work-related for a change.</p>
<p>Something funny is that Germans always find each other automatically—it must be a natural phenomenon similar to magnetism—I don&rsquo;t know.</p>
<p>After that, we went to the afterparty, which pretty much summed up day 1 of Connect.</p>
<h2 id="vmug-st-louis--day-2-sessions-exams--a-personal-milestone">VMUG St. Louis – Day 2: Sessions, Exams &amp; a Personal Milestone</h2>
<p>Day 2 started right on time at <strong>8:00 AM</strong> with a shared breakfast, followed by the <strong>General Session at 9:00 AM</strong> led by <strong>Brenda Emerson</strong> and <strong>Brad Tompkins</strong>. A great way to start the event and get everyone energized for the day.</p>
<p>One nice addition to this event: <strong>free VCF certification exam slots</strong> were offered on-site. A great opportunity for those who hadn’t taken them yet – and yes, they were <strong>completely free</strong> for VMUG Connect attendees.</p>
<p>And speaking of certifications – something very special happened for me personally:<br>
Just as the VMUG officially started, I received confirmation that I’ve been accepted as a <strong>Broadcom Knight</strong>, specializing in <strong>NSX</strong>. I’m incredibly proud to now be a certified part of this community and grateful for the support from both <a href="https://www.evoila.de">evoila</a> and the Knight program.</p>
<p>The first session I joined was:<br>
<strong>Hypervisors: The Elephant in the SOC</strong><br>
<em>Austin Gadient, CTO &amp; Cofounder, Vali Cyber</em><br>
Focus was on securing <strong>ESXi hosts</strong>, which – let’s be honest – are often overlooked when it comes to security. It was a thought-provoking session and the product they presented definitely caught my attention.</p>
<p>After a short break, I joined:<br>
<strong>Simple Kubernetes Deployment with VMware Cloud Foundation (VCF)</strong><br>
<em>Kyle Gleed</em><br>
This one was more introductory and probably aimed at a different audience – so nothing new for me, but still good to see the topic get visibility.</p>
<p>Later on, I finally got to meet <strong><a href="https://www.linkedin.com/in/francisco-barragan/">Franky Barragan</a></strong> in person during his <strong>{code} session</strong>. I was genuinely happy to connect in real life – of course, we took a selfie right away! Dude, it was nice to meet you. Keep up all your Community work.</p>
<figure><picture><source srcset="/vmug-stl-25/vmug06_hu_455f3a97f75f50b1.jpg" type="image/jpeg">
          <img
            src="/vmug-stl-25/vmug06_hu_455f3a97f75f50b1.jpg"alt="Franky and I"width="1344"
            height="896"/>
        </picture><figcaption><p>Franky &amp; I</p></figcaption></figure>
<p>The <strong>session highlight of the day</strong> for me was:<br>
<strong>You Aren’t Ready: Stories from a Cyber Incident Survivor</strong><br>
<em>Steve Athanas</em><br>
His talk was packed with energy, personal stories, and a powerful message: most of us still don’t take <strong>cybersecurity</strong> seriously enough. It was a wake-up call – and a fantastic talk.</p>
<h2 id="bonus-the-power-of-knowledge-sharing">Bonus: The Power of Knowledge Sharing</h2>
<p>Another special moment for me was attending the session:<br>
<strong>The Power of Knowledge Sharing: Building Trust and Growing Your Career with VMUG</strong><br>
<em>Jens Klasen</em></p>
<p><a href="https://www.linkedin.com/in/jensklasen/">Jens</a>
is a respected former colleague and a good friend – and of course, I had to show my support by joining his session.<br>
(Not just because he used a <strong>picture of me</strong> in his presentation – although that definitely caught my attention 😄).</p>
<figure><picture><source srcset="/vmug-stl-25/vmug10_hu_3912203755ff74a1.jpg" type="image/jpeg">
          <img
            src="/vmug-stl-25/vmug10_hu_3912203755ff74a1.jpg"alt="Jens"width="1541"
            height="1026"/>
        </picture><figcaption><p>Jens Klasen</p></figcaption></figure>
<p>It’s always inspiring to see people you know sharing their stories and giving back to the community. His talk was authentic, practical, and a great reminder of how much <strong>knowledge sharing</strong> can impact your personal and professional growth. It also reminded me that I still have to prepare a session for the VMUG in Kaiserslautern, which he happens to be chairing.</p>
<p>We ended the evening at the hotel rooftop bar. A great second day full of impressions.</p>
<figure><picture><source srcset="/vmug-stl-25/vmug07_hu_76aec7c32117641e.jpg" type="image/jpeg">
          <img
            src="/vmug-stl-25/vmug07_hu_76aec7c32117641e.jpg"alt="The Germans and our fantastic Polish colleague from Evoila Poland"width="1794"
            height="1196"/>
        </picture><figcaption><p>The Germans and our fantastic Polish colleague from Evoila Poland</p></figcaption></figure>
<h2 id="vmug-st-louis--day-3-homelabs-heroes--honest-answers">VMUG St. Louis – Day 3: Homelabs, Heroes &amp; Honest Answers</h2>
<p>Just like the day before, Day 3 began with a shared breakfast at the venue – a nice way to reconnect and ease into the final day of the event.</p>
<p>My first session of the day was also one of my personal highlights:<br>
<strong>Homelabs Breakout Session</strong><br>
<em>William Lam, Distinguished Platform Engineering Architect</em></p>
<p>Finally getting to meet <strong><a href="https://williamlam.com/">William Lam</a></strong> in person was a big moment for me. He’s been one of my biggest motivators and sources of inspiration over the years.<br>
He actually had two slots that day: an <strong>exclusive AMA session</strong> for VMUG Advantage members and a second, open <strong>Homelab session</strong> for everyone.<br>
Needless to say – I got my <strong>selfie</strong> with him and I attended both sessions!</p>
<figure><picture><source srcset="/vmug-stl-25/vmug08_hu_5d5819cc8a225c21.jpg" type="image/jpeg">
          <img
            src="/vmug-stl-25/vmug08_hu_5d5819cc8a225c21.jpg"alt="William and I"width="1980"
            height="1320"/>
        </picture><figcaption><p>William &amp; I</p></figcaption></figure>
<p>Next up was:<br>
<strong>How to Be an Influencer</strong><br>
<em>Corey Romero, Senior Community (vExpert) Manager</em><br>
While it didn’t offer much that was new to me, it was definitely entertaining and fun to watch.</p>
<p>The day wrapped up with a special <strong>General Session</strong> featuring <strong>Hock Tan</strong>, who took questions directly from the VMUG community. Sure, not every answer may have pleased everyone – but just the fact that the <strong>CEO of a multi-billion-dollar company</strong> shows up to a <strong>community-run event</strong> and takes live questions speaks volumes. It was a strong message: <strong>Broadcom is listening</strong>.</p>
<figure><picture><source srcset="/vmug-stl-25/vmug09_hu_57264e2d4c752248.jpg" type="image/jpeg">
          <img
            src="/vmug-stl-25/vmug09_hu_57264e2d4c752248.jpg"alt="Hock Tan and Chris McCain, Broadcom"width="1968"
            height="1312"/>
        </picture><figcaption><p>Hock Tan and Chris McCain, Broadcom</p></figcaption></figure>
<h2 id="final-thoughts">Final Thoughts</h2>
<p>Even though the technical depth of this event wasn’t quite on the same level as the <strong>Broadcom Knight Event in Amsterdam</strong>, the <strong>strongest asset of VMUG is the community</strong> and the connections you make.<br>
The exchange with others – whether it’s VMUG members, Broadcom folks, or fellow vExperts – is a huge part of what makes these events so valuable.</p>
<p>For me personally, the trip was absolutely worth it. I’m grateful for all the impressions, conversations, and moments I was able to collect. It’s hard to capture everything in a single blog post – especially when most of my content is usually about tech and homelab stuff. I’m pretty sure I forgot a bunch of things&hellip; and there are <strong>at least 300 photos</strong> I haven’t even shown.</p>
<p>Still, I have to wrap this up at some point – my MacBook battery is nearly dead, and I want to enjoy the last evening in St. Louis with Andreas. By the time you’re reading this, I’m probably already on the flight back home. I hope there will be more <strong>VMUG Connect</strong> events like this in the future – and that I’ll continue to be part of this amazing community.<br>
To share knowledge, to learn from others, and to keep growing – together.</p>
<p><strong>In this spirit: sharing is caring.</strong></p>
]]></content>
		</item>
		
		<item>
			<title>VCF - How to use the Broadcom Download Token</title>
			<link>https://sdn-warrior.org/posts/vcf-token/</link>
			<pubDate>Thu, 10 Apr 2025 22:00:00 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf-token/</guid>
			<description><![CDATA[A short article on how to use the Broadcom Download Token.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>Broadcom is currently changing how its public repository is accessed. This affects many products, including vCenter, ESXi updates, vSAN File Services and, of course, VCF.
The changes are far-reaching: both the repo URL and the way it is accessed have changed. From April 23, 2025, access with the username and password of the Broadcom account will no longer work. Time to take a look at the whole thing.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Broadcom provides PowerShell scripts via a KB article to make the changes, but there is also a manual way, which I find easier for VCF. Also, I don&rsquo;t need to write a blog post to show how to start a PowerShell script - which hopefully will work.</div>
    </aside>
<h2 id="what-is-changing-in-detail">What is changing in detail?</h2>
<p>First of all, the URL: it will be <em><strong>dl.broadcom.com</strong></em> for all the components mentioned above. This means that <em><strong>depot.vmware.com, hostupdate.vmware.com and app-updates.vmware.com</strong></em> will disappear and can therefore be removed from your proxies if you use a proxy with whitelisting.
The second change concerns authentication. Access will no longer be possible with your username and password. Instead, a download token must be created via the support portal. If you have multiple Site IDs, you must issue the token for the Site ID that holds the contracts for the products you want to update.
After the products have been reconfigured, the username and password are ignored.</p>
<p>As of today, there are unfortunately no patches that already support the new repo.</p>
<h2 id="customize-sddc-manager---the-manual-way">Customize SDDC Manager - the manual way</h2>
<p>To make the change, a specific file must be adapted. As with many things in the SDDC Manager, this is done in application-prod.properties, which can be found in the following path:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">/opt/vmware/vcf/lcm/lcm-app/conf/application-prod.properties
</span></span></code></pre></div><p>To edit the file, log in to the SDDC Manager via SSH as the vcf user and then switch to root using su.</p>
<p>The settings can be found under LCM DEPOT PROPERTIES. The following marked lines must be adjusted:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">################### LCM DEPOT PROPERTIES ########################
</span></span><span class="line"><span class="cl">lcm.depot.adapter.host=depot.vmware.com &lt;---change me
</span></span><span class="line"><span class="cl">lcm.depot.adapter.port=443
</span></span><span class="line"><span class="cl">lcm.depot.adapter.remote.rootDir=/PROD2 &lt;---change me
</span></span><span class="line"><span class="cl">lcm.depot.adapter.remote.repoDir=/evo/vmw &lt;---change me
</span></span><span class="line"><span class="cl">lcm.depot.adapter.local.baseDir=/nfs/vmware/vcf/nfs-mount/bundle/depot/local
</span></span><span class="line"><span class="cl">lcm.depot.adapter.enableBundleSignatureValidation=true
</span></span><span class="line"><span class="cl">lcm.depot.adapter.certificateCheckEnabled=true
</span></span><span class="line"><span class="cl">lcm.depot.adapter.remote.index.filename=index.v3
</span></span><span class="line"><span class="cl">lcm.depot.adapter.softwareCompatibilitySetsFile=softwareCompatibilitySets.json
</span></span><span class="line"><span class="cl">lcm.depot.adapter.partnerBundleMetadata.updated.filename=vxrailPartnerBundleMetadata.json
</span></span><span class="line"><span class="cl">lcm.depot.credential.file.path=/opt/vmware/vcf/etc/depot.cred
</span></span><span class="line"><span class="cl">lcm.depot.bundleElement.patchFile.checksumValidation=true
</span></span><span class="line"><span class="cl">lcm.depot.adapter.lcmManifestFile=lcmManifest.json
</span></span><span class="line"><span class="cl">lcm.depot.adapter.remote.lcmManifestDir=/evo/vmw/lcm/manifest &lt;---change me
</span></span><span class="line"><span class="cl">lcm.depot.adapter.remote.lcmProductVersionCatalogDir=/COMP/SDDC_MANAGER_VCF/lcm/productVersionCatalog &lt;---add me
</span></span></code></pre></div><p>The following values must be set:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">lcm.depot.adapter.host=dl.broadcom.com
</span></span><span class="line"><span class="cl">lcm.depot.adapter.remote.rootDir=/&lt;downloadToken&gt;/PROD
</span></span><span class="line"><span class="cl">lcm.depot.adapter.remote.repoDir=/COMP/SDDC_MANAGER_VCF
</span></span><span class="line"><span class="cl">lcm.depot.adapter.remote.lcmManifestDir=/COMP/SDDC_MANAGER_VCF/lcm/manifest
</span></span><span class="line"><span class="cl">lcm.depot.adapter.remote.lcmProductVersionCatalogDir=/COMP/SDDC_MANAGER_VCF/lcm/productVersionCatalog
</span></span></code></pre></div><p>After we have changed all the values, we only need to restart the lcm service.</p>

    <aside class="admonition danger">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-triangle">
      <path d="M10.29 3.86L1.82 18a2 2 0 0 0 1.71 3h16.94a2 2 0 0 0 1.71-3L13.71 3.86a2 2 0 0 0-3.42 0z"></path>
      <line x1="12" y1="9" x2="12" y2="13"></line>
      <line x1="12" y1="17" x2="12.01" y2="17"></line>
   </svg></div><b>Snapshot</b>
        </div>
        <div class="admonition-content">It is strongly recommended to take a snapshot of the SDDC Manager before making any changes to the file.</div>
    </aside>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">systemctl restart lcm
</span></span></code></pre></div>
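<p>If you would rather script the property changes described above than edit the file by hand, the replacement logic can be sketched in a few lines of Python. This is only a minimal sketch and not the Broadcom script: the function names <code>apply_depot_overrides</code> and <code>depot_overrides</code> are made up for illustration, and the download token is passed in as a placeholder. Existing keys are replaced in place, and keys that are missing from the file are appended at the end.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">def apply_depot_overrides(properties_text, overrides):
    """Return properties_text with each key in overrides set to its new value."""
    remaining = dict(overrides)
    updated_lines = []
    for line in properties_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and key in remaining:
            # Replace the value of an existing key in place.
            updated_lines.append(f"{key}={remaining.pop(key)}")
        else:
            # Comments, blank lines and unrelated keys stay untouched.
            updated_lines.append(line)
    # Keys that were not present in the file are appended at the end.
    for key, value in remaining.items():
        updated_lines.append(f"{key}={value}")
    return "\n".join(updated_lines) + "\n"


def depot_overrides(download_token):
    """Build the depot values listed in this post for a given token (hypothetical helper)."""
    repo_dir = "/COMP/SDDC_MANAGER_VCF"
    return {
        "lcm.depot.adapter.host": "dl.broadcom.com",
        "lcm.depot.adapter.remote.rootDir": f"/{download_token}/PROD",
        "lcm.depot.adapter.remote.repoDir": repo_dir,
        "lcm.depot.adapter.remote.lcmManifestDir": f"{repo_dir}/lcm/manifest",
        "lcm.depot.adapter.remote.lcmProductVersionCatalogDir": f"{repo_dir}/lcm/productVersionCatalog",
    }
</code></pre></div>
<p>It makes sense to run something like this against a copy of application-prod.properties first and to compare the result with the original before overwriting anything on the SDDC Manager.</p>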
    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">After restarting the service, the SDDC Manager displays an error message that the depot settings are not configured. A dummy username and a dummy password must be entered here. These values are neither checked nor used; they are only there so that the validation in the GUI works.</div>
    </aside>
<figure><picture><source srcset="/vcf-token/01_hu_92d7e9fc083b719.png" type="image/png">
          <img
            src="/vcf-token/01_hu_92d7e9fc083b719.png"alt="SDDC Manager"width="729"
            height="280"/>
        </picture><figcaption><p>SDDC Depot settings (click to enlarge)</p></figcaption></figure>
<h2 id="validation-sddc-manager">Validation SDDC Manager</h2>
<p>The easiest way to test it is of course to download a bundle. However, if no bundle is available for download at the moment because everything has already been downloaded, you can also check the LCM debug log. Since the SDDC Manager periodically searches for new bundles, calls to the new repo should appear in the log.
The log can be found under /var/log/vmware/vcf/lcm/lcm-debug.log</p>

    <aside class="admonition danger">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-triangle">
      <path d="M10.29 3.86L1.82 18a2 2 0 0 0 1.71 3h16.94a2 2 0 0 0 1.71-3L13.71 3.86a2 2 0 0 0-3.42 0z"></path>
      <line x1="12" y1="9" x2="12" y2="13"></line>
      <line x1="12" y1="17" x2="12.01" y2="17"></line>
   </svg></div><b>Caution</b>
        </div>
        <div class="admonition-content">The log contains the download token in plain text.</div>
    </aside>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-Plaintext" data-lang="Plaintext"><span class="line"><span class="cl">Getting file size for /COMP/SDDC_MANAGER_VCF/manifests/bundle-14555.manifest.sig from URL https://dl.broadcom.com:443/&lt;token&gt;/PROD/COMP/SDDC_MANAGER_VCF/manifests/bundle-14555.manifest.sig
</span></span></code></pre></div>
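<p>Because the token shows up in plain text, it is worth masking it before you paste log excerpts into a ticket or a chat. The following is a minimal sketch under the assumption that the token is always the path segment between the host and <code>/PROD/</code>, as in the log line above. The function name <code>redact_token</code> and the marker <code>REDACTED</code> are my own choices for illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import re

# Matches the path segment between the depot host and /PROD/, which is
# where the download token appears in the log line shown above.
TOKEN_IN_URL = re.compile(r"(https://dl\.broadcom\.com(?::\d+)?/)[^/\s]+(/PROD/)")


def redact_token(log_line):
    """Replace the download token in a depot URL with a fixed marker."""
    return TOKEN_IN_URL.sub(r"\1REDACTED\2", log_line)
</code></pre></div>
<p>Lines without a depot URL are returned unchanged, so the function can be applied to every line of lcm-debug.log.</p>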
    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Broadcom writes that the changes must be made again after patching the SDDC Manager. I think that there will be an adjustment in the next release of the SDDC Manager and that username and password in the GUI will be replaced by download tokens. Before every update of the SDDC Manager, it is recommended to read the release notes.</div>
    </aside>
<h2 id="update-vcenter">Update vCenter</h2>
<p>vCenter is very easy to set up. To do this, open the vCenter Server Management interface (via port 5480) and log in as root. Under Update and then Settings you can enter a custom repo URL. Username and password can be left empty as they are not needed.</p>
<p>The following URL must be entered:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-Plaintext" data-lang="Plaintext"><span class="line"><span class="cl">https://dl.broadcom.com/&lt;downloadToken&gt;/PROD/COMP/VCENTER/vmw/8d167796-34d5-4899-be0a-6daade4005a3/8.0.3.00400
</span></span></code></pre></div><h2 id="kb-article">KB Article</h2>
<p>Here is a summary KB article that links to all other KB articles for the other products. For some products, there is currently no scripted method. <a href="https://knowledge.broadcom.com/external/article/390098">Broadcom KB</a></p>
<h2 id="conclusion">Conclusion</h2>
<p>The manual way is relatively simple and quick if, for whatever reason, you do not want to execute the PowerShell script from Broadcom. I have tested both variants, and the PowerShell script has the advantage that entries are validated. However, the execution policy must be changed to Unrestricted for the PowerShell script from Broadcom, and there are customers who do not allow this. The manual method may then be preferable.</p>
]]></content>
		</item>
		
		<item>
			<title>VCF Quick Tip - WLD with single NSX Manager</title>
			<link>https://sdn-warrior.org/posts/vcf-single-nsx/</link>
			<pubDate>Tue, 01 Apr 2025 00:42:39 +0200</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf-single-nsx/</guid>
			<description><![CDATA[Manipulate the SDDC Manager and Cloud Builder to enable deployment with only one NSX Manager.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>Here is a quick VCF tip for anyone who wants to deploy VCF but doesn&rsquo;t have endless resources in the lab. With a few minor adjustments, it is possible to deploy a VI workload domain (WLD) or even the management domain with just one NSX Manager.
We can also reduce the management domain to three hosts. The only more economical way is the convert workflow with NFS. In this case, a management domain with two hosts is possible.</p>

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>Warning</b>
        </div>
        <div class="admonition-content">Please only use this in the lab. The setup is completely unsupported and should never be carried out in a production environment.</div>
    </aside>
<h2 id="management-domain">Management Domain</h2>
<p>To reduce the management domain to 3 vSAN hosts, we have to log in to the Cloud Builder as admin via SSH and switch to root.
In the directory <em><strong>/etc/vmware/vcf/bringup/</strong></em>, application.properties must be adapted and the following line must be changed to the value 3.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">bringup.mgmt.cluster.minimum.size=3
</span></span></code></pre></div><p>After that, the bringup service must be restarted.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">systemctl restart vcf-bringup.service
</span></span></code></pre></div><p>After the adjustments, we can deploy a management domain with only three hosts – nice.
For this to work, the deployment must be done via a JSON file. There is a good JSON generator from <a href="https://www.martingustafsson.com/vcf-ui-json/">Martin Gustafsson</a> (great tool), or you can upload the VMware Excel sheet to the Cloud Builder and use the built-in JSON generator.</p>
<h3 id="using-the-sos-utility-json-generator">Using the SoS Utility JSON Generator</h3>
<p>The JSON generator options within the SoS Utility provide a method to execute the creation of the JSON file from a completed deployment parameter workbook. To run the JSON generator, you must provide, as a minimum, a path to the deployment parameter workbook and the design type using the following syntax:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">/opt/vmware/sddc-support/sos --jsongenerator --jsongenerator-input /tmp/vcf-ems-deployment-parameter.xlsx --jsongenerator-design vcf-ems
</span></span></code></pre></div><table>
  <thead>
      <tr>
          <th>Option</th>
          <th>Description</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>--jsongenerator</code></td>
          <td>Invokes the JSON generator utility.</td>
      </tr>
      <tr>
          <td><code>--jsongenerator-input &lt;JSONGENERATORINPUT&gt;</code></td>
          <td>Specify the path to the input file to be used by the JSON generator utility. For example: <code>/tmp/vcf-ems-deployment-parameter.xlsx</code>.</td>
      </tr>
      <tr>
          <td><code>--jsongenerator-design vcf-ems</code></td>
          <td>Use <code>vcf-ems</code> for VMware Cloud Foundation.</td>
      </tr>
      <tr>
          <td><code>--jsongenerator-design vcf-vxrail</code></td>
          <td>Use <code>vcf-vxrail</code> for VMware Cloud Foundation on Dell VxRail.</td>
      </tr>
      <tr>
          <td><code>--jsongenerator-supress</code></td>
          <td>Suppresses the confirmation prompt and forces cleanup of the directory (optional).</td>
      </tr>
      <tr>
          <td><code>--jsongenerator-logs &lt;JSONGENERATORLOGS&gt;</code></td>
          <td>Specify the logs directory path.</td>
      </tr>
  </tbody>
</table>
<p>To deploy our management domain with only three ESX servers and one NSX Manager, all we need to do is adjust the JSON so that it contains only three ESX servers and only one NSX Manager plus the NSX Manager VIP. The bring-up Excel sheet expects four ESX servers and three NSX Managers.</p>
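<p>If the JSON was generated from the Excel sheet, the trimming can also be scripted instead of done by hand. The following is only a minimal sketch: the function name <code>shrink_bringup_spec</code> is made up, and <code>hostSpecs</code> is an assumed key name for the list of ESX hosts, so check it against your generated file. The keys <code>nsxtSpec</code> and <code>nsxtManagers</code> are the ones used in my sample JSON.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import copy


def shrink_bringup_spec(spec, host_count=3, nsx_manager_count=1):
    """Return a copy of the bring-up spec trimmed to lab-sized counts."""
    trimmed = copy.deepcopy(spec)
    # "hostSpecs" is an assumed key name for the ESX host list.
    if "hostSpecs" in trimmed:
        trimmed["hostSpecs"] = trimmed["hostSpecs"][:host_count]
    # Keep only the first NSX Manager; the VIP entries are left as they are.
    managers = trimmed["nsxtSpec"]["nsxtManagers"]
    trimmed["nsxtSpec"]["nsxtManagers"] = managers[:nsx_manager_count]
    return trimmed
</code></pre></div>
<p>The input dictionary would typically come from <code>json.load</code> on the generated file, and the result can be written back with <code>json.dump</code>. The original spec is not modified, so both versions can be compared afterwards.</p>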
<p>My sample JSON for the management domain:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;deployWithoutLicenseKeys&#34;</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;skipEsxThumbprintValidation&#34;</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;managementPoolName&#34;</span><span class="p">:</span> <span class="s2">&#34;networkpool-001&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;sddcManagerSpec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;secondUserCredentials&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;username&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;password&#34;</span><span class="p">:</span> <span class="s2">&#34;xxx&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;ipAddress&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.0.4&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;hostname&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;rootUserCredentials&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;username&#34;</span><span class="p">:</span> <span class="s2">&#34;root&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;password&#34;</span><span class="p">:</span> <span class="s2">&#34;xxx&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;localUserPassword&#34;</span><span class="p">:</span> <span class="s2">&#34;xxx&#34;</span>
</span></span><span class="line"><span class="cl">  <span class="p">},</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;sddcId&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;esxLicense&#34;</span><span class="p">:</span> <span class="kc">null</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;workflowType&#34;</span><span class="p">:</span> <span class="s2">&#34;VCF&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;ceipEnabled&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;fipsEnabled&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;ntpServers&#34;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&#34;192.168.12.1&#34;</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;dnsSpec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;secondaryNameserver&#34;</span><span class="p">:</span> <span class="s2">&#34;192.168.100.254&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;subdomain&#34;</span><span class="p">:</span> <span class="s2">&#34;lab.home&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;domain&#34;</span><span class="p">:</span> <span class="s2">&#34;lab.home&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;nameserver&#34;</span><span class="p">:</span> <span class="s2">&#34;192.168.11.2&#34;</span>
</span></span><span class="line"><span class="cl">  <span class="p">},</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;networkSpecs&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;networkType&#34;</span><span class="p">:</span> <span class="s2">&#34;MANAGEMENT&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;subnet&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.1.0/24&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;gateway&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.1.1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;vlanId&#34;</span><span class="p">:</span> <span class="s2">&#34;1001&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;mtu&#34;</span><span class="p">:</span> <span class="s2">&#34;1500&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;portGroupKey&#34;</span><span class="p">:</span> <span class="s2">&#34;SDDC-DPortGroup-Mgmt&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;networkType&#34;</span><span class="p">:</span> <span class="s2">&#34;VMOTION&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;subnet&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.2.0/24&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;gateway&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.2.1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;vlanId&#34;</span><span class="p">:</span> <span class="s2">&#34;1002&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;mtu&#34;</span><span class="p">:</span> <span class="s2">&#34;1700&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;portGroupKey&#34;</span><span class="p">:</span> <span class="s2">&#34;SDDC-DPortGroup-vMotion&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;includeIpAddressRanges&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span> <span class="nt">&#34;endIpAddress&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.2.104&#34;</span><span class="p">,</span> <span class="nt">&#34;startIpAddress&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.2.101&#34;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">      <span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;networkType&#34;</span><span class="p">:</span> <span class="s2">&#34;VSAN&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;subnet&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.3.0/24&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;gateway&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.3.1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;vlanId&#34;</span><span class="p">:</span> <span class="s2">&#34;1003&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;mtu&#34;</span><span class="p">:</span> <span class="s2">&#34;1700&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;portGroupKey&#34;</span><span class="p">:</span> <span class="s2">&#34;SDDC-DPortGroup-VSAN&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;includeIpAddressRanges&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span> <span class="nt">&#34;endIpAddress&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.3.104&#34;</span><span class="p">,</span> <span class="nt">&#34;startIpAddress&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.3.101&#34;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">      <span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;networkType&#34;</span><span class="p">:</span> <span class="s2">&#34;VM_MANAGEMENT&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;subnet&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.0.0/24&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;gateway&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.0.1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;vlanId&#34;</span><span class="p">:</span> <span class="s2">&#34;1000&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;mtu&#34;</span><span class="p">:</span> <span class="s2">&#34;1700&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;portGroupKey&#34;</span><span class="p">:</span> <span class="s2">&#34;SDDC-DPortGroup-VM-Mgmt&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">  <span class="p">],</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;nsxtSpec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;nsxtManagerSize&#34;</span><span class="p">:</span> <span class="s2">&#34;small&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;nsxtManagers&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span> <span class="nt">&#34;hostname&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01-nsx01a&#34;</span><span class="p">,</span> <span class="nt">&#34;ip&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.0.3&#34;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">],</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;rootNsxtManagerPassword&#34;</span><span class="p">:</span> <span class="s2">&#34;xxx&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;nsxtAdminPassword&#34;</span><span class="p">:</span> <span class="s2">&#34;xxx&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;nsxtAuditPassword&#34;</span><span class="p">:</span> <span class="s2">&#34;xxx&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;vip&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.0.2&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;vipFqdn&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01-nsx01&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;nsxtLicense&#34;</span><span class="p">:</span> <span class="kc">null</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;transportVlanId&#34;</span><span class="p">:</span> <span class="mi">1004</span>
</span></span><span class="line"><span class="cl">  <span class="p">},</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;vsanSpec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;vsanDedup&#34;</span><span class="p">:</span> <span class="s2">&#34;false&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;esaConfig&#34;</span><span class="p">:</span> <span class="p">{</span> <span class="nt">&#34;enabled&#34;</span><span class="p">:</span> <span class="kc">false</span> <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;datastoreName&#34;</span><span class="p">:</span> <span class="s2">&#34;m01-cluster-001-vsan&#34;</span>
</span></span><span class="line"><span class="cl">  <span class="p">},</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;dvsSpecs&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;dvsName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-vds1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;vmnics&#34;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&#34;vmnic0&#34;</span><span class="p">,</span> <span class="s2">&#34;vmnic1&#34;</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;mtu&#34;</span><span class="p">:</span> <span class="mi">1700</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;networks&#34;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&#34;MANAGEMENT&#34;</span><span class="p">,</span> <span class="s2">&#34;VMOTION&#34;</span><span class="p">,</span> <span class="s2">&#34;VSAN&#34;</span><span class="p">,</span> <span class="s2">&#34;VM_MANAGEMENT&#34;</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;niocSpecs&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span> <span class="nt">&#34;trafficType&#34;</span><span class="p">:</span> <span class="s2">&#34;VSAN&#34;</span><span class="p">,</span> <span class="nt">&#34;value&#34;</span><span class="p">:</span> <span class="s2">&#34;HIGH&#34;</span> <span class="p">},</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span> <span class="nt">&#34;trafficType&#34;</span><span class="p">:</span> <span class="s2">&#34;VMOTION&#34;</span><span class="p">,</span> <span class="nt">&#34;value&#34;</span><span class="p">:</span> <span class="s2">&#34;LOW&#34;</span> <span class="p">},</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span> <span class="nt">&#34;trafficType&#34;</span><span class="p">:</span> <span class="s2">&#34;VDP&#34;</span><span class="p">,</span> <span class="nt">&#34;value&#34;</span><span class="p">:</span> <span class="s2">&#34;LOW&#34;</span> <span class="p">},</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span> <span class="nt">&#34;trafficType&#34;</span><span class="p">:</span> <span class="s2">&#34;VIRTUALMACHINE&#34;</span><span class="p">,</span> <span class="nt">&#34;value&#34;</span><span class="p">:</span> <span class="s2">&#34;HIGH&#34;</span> <span class="p">},</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span> <span class="nt">&#34;trafficType&#34;</span><span class="p">:</span> <span class="s2">&#34;MANAGEMENT&#34;</span><span class="p">,</span> <span class="nt">&#34;value&#34;</span><span class="p">:</span> <span class="s2">&#34;NORMAL&#34;</span> <span class="p">},</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span> <span class="nt">&#34;trafficType&#34;</span><span class="p">:</span> <span class="s2">&#34;NFS&#34;</span><span class="p">,</span> <span class="nt">&#34;value&#34;</span><span class="p">:</span> <span class="s2">&#34;LOW&#34;</span> <span class="p">},</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span> <span class="nt">&#34;trafficType&#34;</span><span class="p">:</span> <span class="s2">&#34;HBR&#34;</span><span class="p">,</span> <span class="nt">&#34;value&#34;</span><span class="p">:</span> <span class="s2">&#34;LOW&#34;</span> <span class="p">},</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span> <span class="nt">&#34;trafficType&#34;</span><span class="p">:</span> <span class="s2">&#34;FAULTTOLERANCE&#34;</span><span class="p">,</span> <span class="nt">&#34;value&#34;</span><span class="p">:</span> <span class="s2">&#34;LOW&#34;</span> <span class="p">},</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span> <span class="nt">&#34;trafficType&#34;</span><span class="p">:</span> <span class="s2">&#34;ISCSI&#34;</span><span class="p">,</span> <span class="nt">&#34;value&#34;</span><span class="p">:</span> <span class="s2">&#34;LOW&#34;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">      <span class="p">],</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;nsxtSwitchConfig&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;transportZones&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">          <span class="p">{</span> <span class="nt">&#34;name&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01-tz-overlay01&#34;</span><span class="p">,</span> <span class="nt">&#34;transportType&#34;</span><span class="p">:</span> <span class="s2">&#34;OVERLAY&#34;</span> <span class="p">},</span>
</span></span><span class="line"><span class="cl">          <span class="p">{</span> <span class="nt">&#34;name&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01-tz-vlan01&#34;</span><span class="p">,</span> <span class="nt">&#34;transportType&#34;</span><span class="p">:</span> <span class="s2">&#34;VLAN&#34;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="p">]</span>
</span></span><span class="line"><span class="cl">      <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">  <span class="p">],</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;clusterSpec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;clusterName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-cluster-001&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;clusterEvcMode&#34;</span><span class="p">:</span> <span class="kc">null</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;clusterImageEnabled&#34;</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;vmFolders&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;MANAGEMENT&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01-fd-mgmt&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;NETWORKING&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01-fd-nsx&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;EDGENODES&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01-fd-edge&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;resourcePoolSpecs&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;name&#34;</span><span class="p">:</span> <span class="s2">&#34;m01-cluster-001-management-001&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;type&#34;</span><span class="p">:</span> <span class="s2">&#34;management&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuReservationPercentage&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuLimit&#34;</span><span class="p">:</span> <span class="mi">-1</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuReservationExpandable&#34;</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuSharesLevel&#34;</span><span class="p">:</span> <span class="s2">&#34;normal&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuSharesValue&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memoryReservationMb&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memoryLimit&#34;</span><span class="p">:</span> <span class="mi">-1</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memoryReservationExpandable&#34;</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memorySharesLevel&#34;</span><span class="p">:</span> <span class="s2">&#34;normal&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memorySharesValue&#34;</span><span class="p">:</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl">      <span class="p">},</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;name&#34;</span><span class="p">:</span> <span class="s2">&#34;m01-cluster-001-compute-002&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;type&#34;</span><span class="p">:</span> <span class="s2">&#34;compute&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuReservationPercentage&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuLimit&#34;</span><span class="p">:</span> <span class="mi">-1</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuReservationExpandable&#34;</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuSharesLevel&#34;</span><span class="p">:</span> <span class="s2">&#34;normal&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuSharesValue&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memoryReservationPercentage&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memoryLimit&#34;</span><span class="p">:</span> <span class="mi">-1</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memoryReservationExpandable&#34;</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memorySharesLevel&#34;</span><span class="p">:</span> <span class="s2">&#34;normal&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memorySharesValue&#34;</span><span class="p">:</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl">      <span class="p">},</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;name&#34;</span><span class="p">:</span> <span class="s2">&#34;m01-cluster-001-compute-003&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;type&#34;</span><span class="p">:</span> <span class="s2">&#34;compute&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuReservationPercentage&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuLimit&#34;</span><span class="p">:</span> <span class="mi">-1</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuReservationExpandable&#34;</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuSharesLevel&#34;</span><span class="p">:</span> <span class="s2">&#34;normal&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;cpuSharesValue&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memoryReservationPercentage&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memoryLimit&#34;</span><span class="p">:</span> <span class="mi">-1</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memoryReservationExpandable&#34;</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memorySharesLevel&#34;</span><span class="p">:</span> <span class="s2">&#34;normal&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;memorySharesValue&#34;</span><span class="p">:</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl">      <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">]</span>
</span></span><span class="line"><span class="cl">  <span class="p">},</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;pscSpecs&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;adminUserSsoPassword&#34;</span><span class="p">:</span> <span class="s2">&#34;xxx&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;pscSsoSpec&#34;</span><span class="p">:</span> <span class="p">{</span> <span class="nt">&#34;ssoDomain&#34;</span><span class="p">:</span> <span class="s2">&#34;vsphere.local&#34;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">  <span class="p">],</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;vcenterSpec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;vcenterIp&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.0.5&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;vcenterHostname&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-vcsa&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;vmSize&#34;</span><span class="p">:</span> <span class="s2">&#34;small&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;storageSize&#34;</span><span class="p">:</span> <span class="kc">null</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;rootVcenterPassword&#34;</span><span class="p">:</span> <span class="s2">&#34;xxx&#34;</span>
</span></span><span class="line"><span class="cl">  <span class="p">},</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;hostSpecs&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;association&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-datacenter&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;hostname&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01-esx01&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;credentials&#34;</span><span class="p">:</span> <span class="p">{</span> <span class="nt">&#34;username&#34;</span><span class="p">:</span> <span class="s2">&#34;root&#34;</span><span class="p">,</span> <span class="nt">&#34;password&#34;</span><span class="p">:</span> <span class="s2">&#34;xxx&#34;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;association&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-datacenter&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;hostname&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01-esx02&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;credentials&#34;</span><span class="p">:</span> <span class="p">{</span> <span class="nt">&#34;username&#34;</span><span class="p">:</span> <span class="s2">&#34;root&#34;</span><span class="p">,</span> <span class="nt">&#34;password&#34;</span><span class="p">:</span> <span class="s2">&#34;xxx&#34;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;association&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-datacenter&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;hostname&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01-esx03&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;credentials&#34;</span><span class="p">:</span> <span class="p">{</span> <span class="nt">&#34;username&#34;</span><span class="p">:</span> <span class="s2">&#34;root&#34;</span><span class="p">,</span> <span class="nt">&#34;password&#34;</span><span class="p">:</span> <span class="s2">&#34;xxx&#34;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">  <span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><h2 id="vi-workload-domain">VI Workload Domain</h2>
<p>To deploy a new workload domain with only one NSX Manager, we have to modify the SDDC Manager configuration.
Unfortunately, it is not sufficient to simply generate a JSON spec that contains only one NSX Manager.
However, the process is not particularly complicated.</p>
<p>We need to log in to the SDDC Manager via SSH as the vcf user. Then we switch to the root context with su and edit the following file:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">/etc/vmware/vcf/domainmanager/application-prod.properties
</span></span></code></pre></div><p>We have to add three lines of configuration:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">nsxt.manager.formfactor=medium
</span></span><span class="line"><span class="cl">nsxt.management.resources.validation.skip=true
</span></span><span class="line"><span class="cl">nsxt.manager.cluster.size=1
</span></span></code></pre></div><p>After that, we restart the domainmanager service and we are good to go.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">systemctl restart domainmanager.service
</span></span></code></pre></div><p>After the adjustments, we can deploy the workload domain with just one NSX manager via the API and a JSON file. I will describe in detail how exactly this works in a separate article.
Happy VCF deployment!</p>
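<p>If you would rather not edit the file by hand, the three property lines above can be merged in with a small script. This is only a sketch, assuming a Python 3 interpreter is available where you run it; <code>merge_properties</code> and <code>apply_settings</code> are hypothetical helper names, and only the keys, values and file path come from this article. It overwrites an existing key instead of appending a duplicate, so running it twice gives the same result.</p>

```python
from pathlib import Path

# Keys and values taken from the article above.
NSX_SINGLE_NODE_SETTINGS = {
    "nsxt.manager.formfactor": "medium",
    "nsxt.management.resources.validation.skip": "true",
    "nsxt.manager.cluster.size": "1",
}


def merge_properties(text, settings):
    """Return properties text in which every given key is set exactly once."""
    remaining = dict(settings)
    merged = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key in remaining:
            # Overwrite an existing entry instead of appending a duplicate.
            merged.append(f"{key}={remaining.pop(key)}")
        else:
            merged.append(line)
    # Append the keys that were not present yet.
    merged.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(merged) + "\n"


def apply_settings(path, settings=NSX_SINGLE_NODE_SETTINGS):
    """Write a .bak copy of the file, then store the merged settings (hypothetical helper)."""
    target = Path(path)
    original = target.read_text()
    target.with_name(target.name + ".bak").write_text(original)
    target.write_text(merge_properties(original, settings))
```

<p>Called as <code>apply_settings("/etc/vmware/vcf/domainmanager/application-prod.properties")</code> from the root context, it leaves a <code>.bak</code> copy next to the file. The domainmanager service still has to be restarted afterwards.</p>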
]]></content>
		</item>
		
		<item>
			<title>VCF Stretched Cluster</title>
			<link>https://sdn-warrior.org/posts/vcf-stretched-cluster/</link>
			<pubDate>Thu, 13 Mar 2025 23:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf-stretched-cluster/</guid>
			<description><![CDATA[A short article on how to build a stretched vSAN cluster in VCF.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>This is my third article in my VCF Homelab series.
On second thought, I may have put the cart before the horse, because a stretched vSAN cluster requires a functioning management domain.
But ok, that shouldn&rsquo;t bother us now. I&rsquo;ll try to cover all the essentials of the deployment in this article.</p>
<h3 id="what-are-the-benefits-of-a-stretched-vsan-cluster-and-why-does-it-have-to-be-vsan">What are the benefits of a stretched vSAN cluster and why does it have to be vSAN?</h3>
<p>Well, first the obvious.
We have to use vSAN because we have a consolidated design here and VCF (version 5.2.1) does not allow any other principal storage in the consolidated design (also, this article is about a vSAN stretched cluster).
Secondly, we want to use the synchronous replication of vSAN to the second site. Another reason is “German Angst”. We want redundancy here and a multi-AZ design. In Germany, stretched clusters across multiple fire compartments or buildings have been common for a long time. In the past, I have implemented such scenarios myself with layer 2 over layer 3 (aka VXLAN).
The much more elegant way (and also the way supported by VCF) is the vSAN stretched cluster.</p>
<h2 id="lets-get-started">Let&rsquo;s get started</h2>
<p>Of course, as always, there is a flip side to the coin. We can&rsquo;t just stretch a workload domain and all is well.
This is certainly not possible via the GUI of the SDDC Manager.
Yes, that&rsquo;s right, we will work with the API. But more on that later.
If we think about using one or more stretched workload domains, this always means that we also have to stretch our management domain.
This gives us the following requirements:</p>
<ul>
<li>8 ESXi servers for our Management Domain (4 per AZ)</li>
<li>At least 6 ESXi servers for a workload domain (8 recommended)</li>
<li>Both availability zones must contain an equal number of hosts to ensure failover in case any of the availability zones goes down.</li>
<li>Redundant L3 gateways</li>
<li>A set of VLANs</li>
<li>A vSAN Witness Host (ESA or OSA Witness Appliance)</li>
</ul>
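<p>The host-count rules from the list above are easy to check before you start commissioning. A minimal sketch: the function name and the <code>domain_type</code> labels are my own, only the numbers come from the list.</p>

```python
# Minimum total hosts per domain type, as listed above.
MIN_HOSTS = {"management": 8, "workload": 6}


def check_host_counts(domain_type, az1_hosts, az2_hosts):
    """Return a list of problems; an empty list means the counts look fine."""
    problems = []
    total = az1_hosts + az2_hosts
    minimum = MIN_HOSTS[domain_type]
    if total < minimum:
        problems.append(
            f"{domain_type} domain needs at least {minimum} hosts, got {total}"
        )
    if az1_hosts != az2_hosts:
        # Both availability zones must hold the same number of hosts.
        problems.append(f"AZ1 has {az1_hosts} hosts but AZ2 has {az2_hosts}")
    return problems
```

<p>For example, <code>check_host_counts("management", 4, 4)</code> returns an empty list, while an uneven split reports the mismatch.</p>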
<p>You cannot stretch a cluster under the following conditions:</p>
<ul>
<li>The cluster is a vSAN Max cluster.</li>
<li>The cluster has a vSAN remote datastore mounted on it.</li>
<li>The cluster shares a vSAN Storage Policy with any other clusters.</li>
<li>The cluster includes DPU-backed hosts.</li>
</ul>
<p>Latency vSphere</p>
<ul>
<li>Less than 150 ms latency RTT for vCenter Server connectivity.</li>
<li>Less than 150 ms latency RTT for vMotion connectivity.</li>
<li>Less than 5 ms latency RTT for vSAN host connectivity.</li>
</ul>
<p>Latency vSAN Site to Witness</p>
<ul>
<li>Less than 200 ms latency RTT for up to 10 hosts per site.</li>
<li>Less than 100 ms latency RTT for 11-15 hosts per site.</li>
</ul>
<p>Latency NSX Managers</p>
<ul>
<li>Less than 10 ms latency RTT between NSX Managers</li>
<li>Less than 150 ms latency RTT between NSX Managers and transport nodes.</li>
</ul>
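<p>If you measure the round-trip times in your own environment, the limits above can be compared in one go. This is a sketch under my own naming: the dictionary keys and function names are illustrative, and only the millisecond values are taken from the lists above. Since every limit reads "less than", a measurement equal to the limit counts as a violation.</p>

```python
# RTT limits in milliseconds, copied from the lists above.
RTT_LIMITS_MS = {
    "vcenter": 150,
    "vmotion": 150,
    "vsan_hosts": 5,
    "nsx_manager_to_manager": 10,
    "nsx_manager_to_transport_node": 150,
}


def witness_limit_ms(hosts_per_site):
    """The site-to-witness limit depends on the number of hosts per site."""
    if hosts_per_site <= 10:
        return 200
    if hosts_per_site <= 15:
        return 100
    raise ValueError("no limit listed for more than 15 hosts per site")


def latency_violations(measured_ms, hosts_per_site):
    """Return the sorted names of all links whose RTT is not below its limit."""
    limits = dict(RTT_LIMITS_MS, witness=witness_limit_ms(hosts_per_site))
    return sorted(
        name for name, rtt in measured_ms.items() if rtt >= limits[name]
    )
```

<p>Pass the measurements as a dictionary, for example <code>latency_violations({"vsan_hosts": 6.2, "witness": 120}, hosts_per_site=4)</code>, which flags only the vSAN link.</p>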

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">I will only explain the stretching of the management domain in my article, as my resources in the lab are not infinite and the process is identical for the workload domain.</div>
    </aside>
<h2 id="vlans">VLANs</h2>
<table>
  <thead>
      <tr>
          <th>Function</th>
          <th style="text-align: center">Availability Zone 1</th>
          <th style="text-align: center">Availability Zone 2</th>
          <th style="text-align: center">HA Layer 3 Gateway</th>
          <th style="text-align: center">Recommended MTU</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>VM Management VLAN</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">1500</td>
      </tr>
      <tr>
          <td>Management VLAN (AZ1)</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">X</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">1500</td>
      </tr>
      <tr>
          <td>vMotion VLAN (AZ1)</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">X</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">9000</td>
      </tr>
      <tr>
          <td>vSAN VLAN (AZ1)</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">X</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">9000</td>
      </tr>
      <tr>
          <td>NSX Host Overlay VLAN (AZ1)</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">X</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">9000</td>
      </tr>
      <tr>
          <td>NSX Edge Uplink01 VLAN</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">X</td>
          <td style="text-align: center">9000</td>
      </tr>
      <tr>
          <td>NSX Edge Uplink02 VLAN</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">X</td>
          <td style="text-align: center">9000</td>
      </tr>
      <tr>
          <td>NSX Edge Overlay VLAN</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">9000</td>
      </tr>
      <tr>
          <td>Management VLAN (AZ2)</td>
          <td style="text-align: center">X</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">1500</td>
      </tr>
      <tr>
          <td>vMotion VLAN (AZ2)</td>
          <td style="text-align: center">X</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">9000</td>
      </tr>
      <tr>
          <td>vSAN VLAN (AZ2)</td>
          <td style="text-align: center">X</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">9000</td>
      </tr>
      <tr>
          <td>NSX Host Overlay VLAN (AZ2)</td>
          <td style="text-align: center">X</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">✓</td>
          <td style="text-align: center">9000</td>
      </tr>
  </tbody>
</table>
<p>In a lab setup, you can also use an MTU of 1700 bytes instead of 9000 bytes.
How the redundant gateways are built depends heavily on the physical network design.
For a spine/leaf network with top-of-rack switches, you can use VRRP, HSRP, or anycast gateway technology, depending on the manufacturer, expected traffic, fabric design, and so on.
In my setup, I have implemented the gateways on my top-of-rack switch.</p>
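<p>Whichever MTU you pick from the table, it is worth verifying it end to end with a ping that is not allowed to fragment. The largest ICMP payload that fits is the MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header. The sketch below only builds the <code>vmkping</code> command line; the VMkernel interface name and the target address in the usage note are placeholders, not values from my lab.</p>

```python
IPV4_HEADER_BYTES = 20
ICMP_HEADER_BYTES = 8


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits into one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def vmkping_command(mtu, vmk, target_ip):
    """Build a vmkping call with the don't-fragment flag (-d) and payload size (-s)."""
    return f"vmkping -I {vmk} -d -s {max_ping_payload(mtu)} {target_ip}"
```

<p>For an MTU of 9000, <code>vmkping_command(9000, "vmk2", "192.0.2.10")</code> yields a payload size of 8972. With the lab MTU of 1700 it is 1672.</p>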
<h2 id="deploy-vsan-witness-host">Deploy vSAN Witness Host</h2>
<p>A vSAN witness host is required for a stretched vSAN cluster. Ready-made appliances can be downloaded from the Broadcom support portal. They are delivered as OVA files and must be deployed at a third, independent site. The vSAN witness can be connected via either Layer 2 or Layer 3.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Best Practice</b>
        </div>
        <div class="admonition-content">The recommended approach is a Layer 3 connection in which each site has its own independent routed path to the witness appliance.</div>
    </aside>
<p>After the successful deployment, the witness host must be registered in the vCenter of the management domain. You must add the vSAN witness host to the datacenter. Do not add it to a folder. Use the fully qualified domain name (FQDN) of the vSAN witness host, not the IP address.
There are a few adjustments to be made to the witness host:</p>
<ul>
<li>Remove the dedicated VMkernel adapter for witness traffic on the vSAN witness host.</li>
<li>Remove the network port group from the virtual machine on the vSAN witness host.</li>
<li>Enable witness traffic on the VMkernel adapter for the management network of the vSAN witness host.</li>
</ul>
<p>A step-by-step guide can be found in the <a href="https://techdocs.broadcom.com/us/en/vmware-cis/vcf/vcf-5-2-and-earlier/5-2/map-for-administering-vcf-5-2/stretching-clusters-admin/deploy-vsan-witness-host-admin/configure-the-vmkernel-adapters-on-the-vsan-witness-host-admin.html">VCF 5.2 Administration Guide</a>.</p>
<h2 id="commission-host">Commission Host</h2>
<p>As I already wrote in the intro, I assume for this article that a management domain is already deployed but not yet stretched.
The physical network is configured and all VLANs and IP networks are available.
An additional network pool is also required; it defines the vSAN and vMotion networks.</p>
<figure><a href="02.png"><picture><source srcset="/vcf-stretch/02_hu_fc62138f57b4e37e.png" type="image/png">
          <img
            src="/vcf-stretch/02_hu_fc62138f57b4e37e.png" alt="Network Pool" width="1251"
            height="875"/>
        </picture></a><figcaption><p>Network Pool (click to enlarge)</p></figcaption></figure>
<p>Now that the preparations are complete, I can commission the additional hosts in the SDDC Manager.
These must have the same build as the existing hosts in the cluster.
If, for example, the management domain has been patched to an ESXi build other than the one in the VCF 5.2.1 BoM, the additional hosts must be updated to the same build before they are commissioned.</p>
<figure><a href="01.png"><picture><source srcset="/vcf-stretch/01_hu_60316a2c1ca53db7.png" type="image/png">
          <img
            src="/vcf-stretch/01_hu_60316a2c1ca53db7.png"alt="SDDC Manager"width="1496"
            height="891"/>
        </picture></a><figcaption><p>SDDC Manager (click to enlarge)</p></figcaption></figure>
<p>After the hosts have been successfully commissioned in SDDC Manager, they must not be added to the management domain through the GUI, because the GUI cannot stretch a cluster.
Next, the cluster stretch spec must be created, which is the fun part of the deployment.</p>
<h2 id="stretch-a-vsan-cluster-aka-fun-part">Stretch a vSAN Cluster (aka Fun Part)</h2>
<p>Unfortunately, the actual stretch of the cluster cannot be carried out in the SDDC Manager GUI. A JSON cluster stretch spec must be created and then passed to SDDC Manager via an API call.</p>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">The ESXi hosts that you are adding to availability zone 2 must use the same vmnic to vSphere Distributed Switch mapping as the existing hosts in availability zone 1.</div>
    </aside>
<p>The cluster stretch spec needs the ID of each ESXi host. These IDs can be read out via an API call. The easiest way to do this is to use the Developer Center in SDDC Manager, although any API tool will also work. At this point, I was simply too lazy to generate an API token, so I ran all the queries in the Developer Center.</p>
<figure><a href="03.png"><picture><source srcset="/vcf-stretch/03_hu_76e7587064ca4935.png" type="image/png">
          <img
            src="/vcf-stretch/03_hu_76e7587064ca4935.png"alt="Developer Center"width="1112"
            height="798"/>
        </picture></a><figcaption><p>Developer Center (click to enlarge)</p></figcaption></figure>
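<p>For anyone who prefers a script over the Developer Center, here is a minimal sketch of the lookup. It assumes that the response of <em><strong>GET /v1/hosts</strong></em> is a JSON object with an “elements” list whose entries carry “id” and “fqdn” fields. The sample payload is shortened and hand-written, and the helper name is made up for this example.</p>

```python
import json


def host_ids_by_fqdn(response: dict) -> dict[str, str]:
    """Map each host FQDN to its SDDC Manager host ID."""
    return {host["fqdn"]: host["id"] for host in response.get("elements", [])}


# Hand-written sample shaped like the assumed API response.
sample = json.loads("""
{
  "elements": [
    {"id": "2bb45762-b2b9-4d94-8003-980f32d449f2", "fqdn": "vcf02-m01-esx05.lab.home"},
    {"id": "a1fafaf3-bb88-4ee6-9a1b-a7ca0e8e9c47", "fqdn": "vcf02-m01-esx06.lab.home"}
  ]
}
""")

for fqdn, host_id in host_ids_by_fqdn(sample).items():
    print(fqdn, host_id)
```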
<p>The response contains the IDs of the ESXi hosts, which can be copied directly into the cluster stretch spec. My cluster stretch spec looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl"> <span class="nt">&#34;clusterStretchSpec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;hostSpecs&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;hostname&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01-esx05.lab.home&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;hostNetworkSpec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">     <span class="nt">&#34;networkProfileName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-w01-az2-nsx-np01&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">     <span class="nt">&#34;vmNics&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;vmnic0&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;uplink&#34;</span><span class="p">:</span> <span class="s2">&#34;uplink1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;vdsName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-vds1&#34;</span>
</span></span><span class="line"><span class="cl">      <span class="p">},</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;vmnic1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;uplink&#34;</span><span class="p">:</span> <span class="s2">&#34;uplink2&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;vdsName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-vds1&#34;</span>
</span></span><span class="line"><span class="cl">      <span class="p">}</span>
</span></span><span class="line"><span class="cl">     <span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;2bb45762-b2b9-4d94-8003-980f32d449f2&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;licenseKey&#34;</span><span class="p">:</span> <span class="s2">&#34;XXXX-XXXXX-XXXXX-XXXXX-XXXXX&#34;</span>
</span></span><span class="line"><span class="cl">   <span class="p">},</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;hostname&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01-esx06.lab.home&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;hostNetworkSpec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">     <span class="nt">&#34;networkProfileName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-w01-az2-nsx-np01&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">     <span class="nt">&#34;vmNics&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;vmnic0&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;uplink&#34;</span><span class="p">:</span> <span class="s2">&#34;uplink1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;vdsName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-vds1&#34;</span>
</span></span><span class="line"><span class="cl">      <span class="p">},</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;vmnic1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;uplink&#34;</span><span class="p">:</span> <span class="s2">&#34;uplink2&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;vdsName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-vds1&#34;</span>
</span></span><span class="line"><span class="cl">      <span class="p">}</span>
</span></span><span class="line"><span class="cl">     <span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;a1fafaf3-bb88-4ee6-9a1b-a7ca0e8e9c47&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;licenseKey&#34;</span><span class="p">:</span> <span class="s2">&#34;XXXX-XXXXX-XXXXX-XXXXX-XXXXX&#34;</span>
</span></span><span class="line"><span class="cl">   <span class="p">},</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;hostname&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01-esx07.lab.home&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;hostNetworkSpec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">     <span class="nt">&#34;networkProfileName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-w01-az2-nsx-np01&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">     <span class="nt">&#34;vmNics&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;vmnic0&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;uplink&#34;</span><span class="p">:</span> <span class="s2">&#34;uplink1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;vdsName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-vds1&#34;</span>
</span></span><span class="line"><span class="cl">      <span class="p">},</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;vmnic1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;uplink&#34;</span><span class="p">:</span> <span class="s2">&#34;uplink2&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;vdsName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-vds1&#34;</span>
</span></span><span class="line"><span class="cl">      <span class="p">}</span>
</span></span><span class="line"><span class="cl">     <span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;a8ece8a4-9264-4b05-8637-02cbc0a85f45&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;licenseKey&#34;</span><span class="p">:</span> <span class="s2">&#34;XXXX-XXXXX-XXXXX-XXXXX-XXXXX&#34;</span>
</span></span><span class="line"><span class="cl">   <span class="p">},</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;hostname&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-m01-esx08.lab.home&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;hostNetworkSpec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">     <span class="nt">&#34;networkProfileName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-w01-az2-nsx-np01&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">     <span class="nt">&#34;vmNics&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;vmnic0&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;uplink&#34;</span><span class="p">:</span> <span class="s2">&#34;uplink1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;vdsName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-vds1&#34;</span>
</span></span><span class="line"><span class="cl">      <span class="p">},</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;vmnic1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;uplink&#34;</span><span class="p">:</span> <span class="s2">&#34;uplink2&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;vdsName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-vds1&#34;</span>
</span></span><span class="line"><span class="cl">      <span class="p">}</span>
</span></span><span class="line"><span class="cl">     <span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;c4e8e06f-5229-4d3c-86a3-a9f0f76fdea2&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;licenseKey&#34;</span><span class="p">:</span> <span class="s2">&#34;XXXX-XXXXX-XXXXX-XXXXX-XXXXX&#34;</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl">  <span class="p">],</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;isEdgeClusterConfiguredForMultiAZ&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;networkSpec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="nt">&#34;networkProfiles&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">     <span class="nt">&#34;isDefault&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">     <span class="nt">&#34;name&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-w01-az2-nsx-np01&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">     <span class="nt">&#34;nsxtHostSwitchConfigs&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;uplinkProfileName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-w01-az2-host-uplink-profile01&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;vdsName&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-vds1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">       <span class="nt">&#34;vdsUplinkToNsxUplink&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">         <span class="nt">&#34;nsxUplinkName&#34;</span><span class="p">:</span> <span class="s2">&#34;uplink-1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">         <span class="nt">&#34;vdsUplinkName&#34;</span><span class="p">:</span> <span class="s2">&#34;uplink1&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="p">},</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">         <span class="nt">&#34;nsxUplinkName&#34;</span><span class="p">:</span> <span class="s2">&#34;uplink-2&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">         <span class="nt">&#34;vdsUplinkName&#34;</span><span class="p">:</span> <span class="s2">&#34;uplink2&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">       <span class="p">]</span>
</span></span><span class="line"><span class="cl">      <span class="p">}</span>
</span></span><span class="line"><span class="cl">     <span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">   <span class="p">],</span>
</span></span><span class="line"><span class="cl">   <span class="nt">&#34;nsxClusterSpec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;uplinkProfiles&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">     <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;name&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-w01-az2-host-uplink-profile01&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;teamings&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">       <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;activeUplinks&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">         <span class="s2">&#34;uplink-1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">         <span class="s2">&#34;uplink-2&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;name&#34;</span><span class="p">:</span> <span class="s2">&#34;DEFAULT&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;policy&#34;</span><span class="p">:</span> <span class="s2">&#34;LOADBALANCE_SRCID&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;standByUplinks&#34;</span><span class="p">:</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl">       <span class="p">}</span>
</span></span><span class="line"><span class="cl">      <span class="p">],</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;transportVlan&#34;</span><span class="p">:</span> <span class="mi">1011</span>
</span></span><span class="line"><span class="cl">     <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">]</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl">  <span class="p">},</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;witnessSpec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="nt">&#34;fqdn&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf02-osa.lab.home&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">   <span class="nt">&#34;vsanCidr&#34;</span><span class="p">:</span> <span class="s2">&#34;192.168.12.0/24&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">   <span class="nt">&#34;vsanIp&#34;</span><span class="p">:</span> <span class="s2">&#34;192.168.12.22&#34;</span>
</span></span><span class="line"><span class="cl">  <span class="p">},</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;witnessTrafficSharedWithVsanTraffic&#34;</span><span class="p">:</span> <span class="kc">false</span>
</span></span><span class="line"><span class="cl"> <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><p>Next, the cluster stretch spec can be validated. The cluster ID is required for this.</p>
<p>It is also retrieved with an API call. To do this, I go to the Developer Center, open the “APIs for managing clusters” section, expand <em><strong>GET /v1/clusters</strong></em>, and execute the API call.</p>
<p>Here is a small excerpt of the response. The correct ID is in the top-level “id” field, in my case “e04c96fe-263c-4355-bdcf-7464b03e35d2”. The domain ID must not be used.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;elements&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;e04c96fe-263c-4355-bdcf-7464b03e35d2&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;domain&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;40a61f61-0686-4514-b6a9-ba47c8533be9&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="p">},</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;name&#34;</span><span class="p">:</span> <span class="s2">&#34;sfo-m01-cluster-001&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;status&#34;</span><span class="p">:</span> <span class="s2">&#34;ACTIVE&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;primaryDatastoreName&#34;</span><span class="p">:</span> <span class="s2">&#34;m01-cluster-001-vsan&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;primaryDatastoreType&#34;</span><span class="p">:</span> <span class="s2">&#34;VSAN&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;hosts&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">                <span class="p">{</span>
</span></span><span class="line"><span class="cl">                    <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;034f44ca-1654-450c-915f-6d3e072bee68&#34;</span>
</span></span><span class="line"><span class="cl">                <span class="p">},</span>
</span></span><span class="line"><span class="cl">                <span class="p">{</span>
</span></span><span class="line"><span class="cl">                    <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;2bb45762-b2b9-4d94-8003-980f32d449f2&#34;</span>
</span></span><span class="line"><span class="cl">                <span class="p">},</span>
</span></span><span class="line"><span class="cl">                <span class="p">{</span>
</span></span><span class="line"><span class="cl">                    <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;7cdd8928-575d-44b0-a604-834d3dd0bf13&#34;</span>
</span></span><span class="line"><span class="cl">                <span class="p">},</span>
</span></span><span class="line"><span class="cl">                <span class="p">{</span>
</span></span><span class="line"><span class="cl">                    <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;9cbbe32f-31ce-450a-863c-31c39e248ab0&#34;</span>
</span></span><span class="line"><span class="cl">                <span class="p">}</span>
</span></span></code></pre></div><p>To validate the cluster stretch spec, I stay in the “APIs for managing clusters” section and expand <em><strong>POST /v1/clusters/{id}/validations</strong></em>.
I enter the ID of the management cluster in the id text field.
In addition, the cluster stretch spec must be copied into the body field.</p>
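<p>Before pasting the spec, a quick local check can catch typos in the names that reference each other. The sketch below only verifies that host IDs are unique and that the network profile and uplink profile names line up. It is not a replacement for the validation API, and the function name is made up for this example.</p>

```python
def check_stretch_spec(spec: dict) -> list[str]:
    """Return a list of consistency problems found in a cluster stretch spec."""
    body = spec["clusterStretchSpec"]
    network = body["networkSpec"]
    profiles = {p["name"]: p for p in network["networkProfiles"]}
    uplink_profiles = {u["name"] for u in network["nsxClusterSpec"]["uplinkProfiles"]}
    problems = []

    seen_ids = set()
    for host in body["hostSpecs"]:
        if host["id"] in seen_ids:
            problems.append(f"duplicate host id: {host['id']}")
        seen_ids.add(host["id"])
        profile_name = host["hostNetworkSpec"]["networkProfileName"]
        if profile_name not in profiles:
            problems.append(f"{host['hostname']}: unknown network profile {profile_name}")

    for profile in profiles.values():
        for switch in profile["nsxtHostSwitchConfigs"]:
            if switch["uplinkProfileName"] not in uplink_profiles:
                problems.append(f"unknown uplink profile {switch['uplinkProfileName']}")
    return problems


# Heavily trimmed version of the spec, reduced to the fields the check reads.
spec = {
    "clusterStretchSpec": {
        "hostSpecs": [
            {
                "hostname": "vcf02-m01-esx05.lab.home",
                "id": "2bb45762-b2b9-4d94-8003-980f32d449f2",
                "hostNetworkSpec": {"networkProfileName": "sfo-w01-az2-nsx-np01"},
            }
        ],
        "networkSpec": {
            "networkProfiles": [
                {
                    "name": "sfo-w01-az2-nsx-np01",
                    "nsxtHostSwitchConfigs": [
                        {"uplinkProfileName": "sfo-w01-az2-host-uplink-profile01"}
                    ],
                }
            ],
            "nsxClusterSpec": {
                "uplinkProfiles": [{"name": "sfo-w01-az2-host-uplink-profile01"}]
            },
        },
    }
}

print(check_stretch_spec(spec))
```

<p>An empty list means the check found no mismatches. The validation API call is still required, because it checks far more than names.</p>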
<figure><a href="05.png"><picture><source srcset="/vcf-stretch/05_hu_214227fba3f19feb.png" type="image/png">
          <img
            src="/vcf-stretch/05_hu_214227fba3f19feb.png"alt="Validation"width="1094"
            height="867"/>
        </picture></a><figcaption><p>Cluster stretch spec validation (click to enlarge)</p></figcaption></figure>
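<p>Since the cluster ID and the domain ID are easy to mix up, here is a small sketch that picks the right one from the response by cluster name. It works on a shortened copy of the excerpt above. The helper name is made up for this example.</p>

```python
def cluster_id_by_name(response: dict, cluster_name: str) -> str:
    """Return the top-level cluster ID, not the nested domain ID."""
    for cluster in response["elements"]:
        if cluster["name"] == cluster_name:
            return cluster["id"]
    raise LookupError(f"no cluster named {cluster_name}")


# Shortened excerpt of the response shown above.
response = {
    "elements": [
        {
            "id": "e04c96fe-263c-4355-bdcf-7464b03e35d2",
            "domain": {"id": "40a61f61-0686-4514-b6a9-ba47c8533be9"},
            "name": "sfo-m01-cluster-001",
        }
    ]
}

cluster_id = cluster_id_by_name(response, "sfo-m01-cluster-001")
# Path of the validation call for this cluster.
print(f"/v1/clusters/{cluster_id}/validations")
```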
<p>After the validation has completed successfully, the cluster can be stretched.
The actual task is started with another API call, this time <em><strong>PATCH /v1/clusters/{id}</strong></em>. The process is the same as for the validation: I copy the cluster ID into the id field and the cluster stretch spec into the body field. After the call is executed, the cluster stretch begins. Its progress can be followed in the task panel of SDDC Manager.</p>
<figure><a href="04.png"><picture><source srcset="/vcf-stretch/04_hu_e81f538bbc49fa08.png" type="image/png">
          <img
            src="/vcf-stretch/04_hu_e81f538bbc49fa08.png"alt="Cluster Stretch"width="1720"
            height="1200"/>
        </picture></a><figcaption><p>Cluster Stretch (click to enlarge)</p></figcaption></figure>
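<p>Outside the Developer Center, the same call could be prepared roughly as follows. This is only a sketch under assumptions: the SDDC Manager FQDN is hypothetical, the token is a placeholder for an API token generated beforehand, and passing it as a bearer token in the Authorization header is my assumption about how the call is authenticated. The request is built but deliberately not sent.</p>

```python
import json
import urllib.request


def build_stretch_request(sddc_fqdn: str, cluster_id: str, spec: dict, token: str) -> urllib.request.Request:
    """Prepare (but do not send) the PATCH request that starts the stretch."""
    return urllib.request.Request(
        url=f"https://{sddc_fqdn}/v1/clusters/{cluster_id}",
        data=json.dumps(spec).encode("utf-8"),
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumption: bearer token auth
        },
    )


request = build_stretch_request(
    sddc_fqdn="sddc-manager.example.test",  # hypothetical FQDN
    cluster_id="e04c96fe-263c-4355-bdcf-7464b03e35d2",
    spec={"clusterStretchSpec": {}},  # replace with the full, validated spec
    token="PLACEHOLDER",  # placeholder for a token generated beforehand
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```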
<p>Depending on the performance of the environment, the process may take some time. For me, it took about as long as the bring-up of the unstretched management domain. Once the task has completed successfully, SDDC Manager and vCenter should look like the following screenshots.</p>
<p><figure><a href="06.png"><picture><source srcset="/vcf-stretch/06_hu_435487497bd721f5.png" type="image/png">
          <img
            src="/vcf-stretch/06_hu_435487497bd721f5.png"alt="Stretched Cluster SDDC"width="1487"
            height="683"/>
        </picture></a><figcaption><p>Stretched Cluster SDDC (click to enlarge)</p></figcaption></figure>
<figure><a href="07.png"><picture><source srcset="/vcf-stretch/07_hu_a00d36f09c5e7264.png" type="image/png">
          <img
            src="/vcf-stretch/07_hu_a00d36f09c5e7264.png"alt="Stretched Cluster vCenter"width="1716"
            height="1084"/>
        </picture></a><figcaption><p>Stretched Cluster vCenter (click to enlarge)</p></figcaption></figure></p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Storage Policy</b>
        </div>
        <div class="admonition-content">When you stretch a cluster, VMware Cloud Foundation changes the site disaster tolerance setting of the storage policy associated with the datastore of that cluster from “None - standard cluster” to “Site mirroring - stretched cluster”. This affects all VMs that use the default datastore policy in that cluster.</div>
    </aside>
<h2 id="notes">Notes</h2>
<p>Further maintenance operations on the stretched cluster, such as upgrades, can be carried out via the GUI. Expanding, shrinking, or unstretching the cluster must currently still be done via the API.</p>
<p>You also need to think about BGP peering. Depending on the setup, each site has its own AS number for BGP, and Broadcom recommends using route maps and local preference to control traffic.
Personally, I would say: it depends. However, this topic is so extensive that it may get its own blog article at some point.
In addition, you have to think about VM placement, storage policies and much more&hellip;</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Opinion</b>
        </div>
        <div class="admonition-content">Design is always a matter of opinions and there is usually not just one perfect solution.</div>
    </aside>
<h2 id="lab-hardware-and-sizing-of-the-vms">Lab Hardware and Sizing of the VMs</h2>
<p>Perhaps a brief word about the resources used. I could have set up the lab more economically.
There is potential for savings in the NSX Managers, where a single manager would have been enough with some adjustments, and in the edge VMs, which could run non-nested.
But I didn&rsquo;t want any surprises during the deployment, so I decided to build the whole thing as close to reality as possible.
My next attempt will be to build the setup as resource-efficiently as possible.</p>
<p>As a nested ESX server, I used the <a href="https://williamlam.com/nested-virtualization/nested-esxi-virtual-appliance">Nested ESXi Virtual Appliance from William Lam</a> and customized it for my use case.
Each of the 8 nested ESXi hosts has 10 cores, 40 GB of RAM and 3 disks: 16 GB for the OS, 40 GB as an OSA cache tier and 400 GB as an OSA capacity tier. Each host uses 2 network cards.</p>
<p>The nested ESXi hosts run on 4 Minisforum MS-01 with 2x 10 Gb networking and 96 GB of RAM each, with 2 nested hosts per MS-01. The VMs run on local NVMe storage so that they are not unnecessarily moved via DRS.
Each MS-01 runs one nested ESXi host from AZ 1 and one from AZ 2. The vSAN Witness Appliance runs on my management ESX server.</p>
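<p>For a rough feel of the totals, here is the arithmetic on the numbers above. These are configured values only. What the nested hosts actually consume at runtime can be lower, and the vSAN Witness Appliance is not included because it runs on a separate host.</p>

```python
nested_hosts = 8
ram_per_nested_gb = 40
cores_per_nested = 10
disks_gb = [16, 40, 400]  # OS, OSA cache tier, OSA capacity tier

physical_hosts = 4
ram_per_physical_gb = 96

configured_ram_gb = nested_hosts * ram_per_nested_gb    # 8 * 40 = 320
physical_ram_gb = physical_hosts * ram_per_physical_gb  # 4 * 96 = 384
configured_vcpus = nested_hosts * cores_per_nested      # 8 * 10 = 80
configured_disk_gb = nested_hosts * sum(disks_gb)       # 8 * 456 = 3648

print(f"RAM configured: {configured_ram_gb} GB of {physical_ram_gb} GB physical")
print(f"vCPUs configured: {configured_vcpus}")
print(f"Virtual disk configured: {configured_disk_gb} GB")
```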
<figure><a href="08.png"><picture><source srcset="/vcf-stretch/08_hu_4939e63a0da844ba.png" type="image/png">
          <img
            src="/vcf-stretch/08_hu_4939e63a0da844ba.png"alt="Lab Usage"width="552"
            height="373"/>
        </picture></a><figcaption><p>Lab usage (click to enlarge)</p></figcaption></figure>
<p>After the lab has booted and all services are running as they should, the resource requirements calm down a bit. I still have 200GB of RAM free on my physical hosts, so there is room for expansion and further tests.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Deploying a stretched cluster is not that difficult if you follow all the preparations and meet all the requirements. The real challenge lies elsewhere, namely in the design of the actual stretched cluster.</p>
]]></content>
		</item>
		
		<item>
			<title>VCF Import Tool - Enable Overlay in an imported VCF Domain</title>
			<link>https://sdn-warrior.org/posts/vcf-import-overlay/</link>
			<pubDate>Wed, 05 Mar 2025 21:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf-import-overlay/</guid>
			<description><![CDATA[Part 2 VCF Import Cluster with NFS and activating the overlay.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>This post is a follow-up to my article <a href="https://sdn-warrior.org/posts/vcf-import-cluster/">&ldquo;VCF Import Tool - Run VCF with NFS as principal Storage&rdquo;</a> and covers the activation of the overlay network after I have successfully converted an ESXi cluster to a VCF management domain using the VCF Import Tool.</p>
<p>The VCF Import Tool still has a few limitations. Among other things, no NSX TEP interface is configured in VCF 5.2.1 after a convert or import. Without the tunnel endpoints it is not possible to use NSX overlay networks. In this post I will show which steps you have to take to prepare NSX so that we can create and use overlay networks.</p>
<h2 id="creating-an-ip-pool">Creating an IP pool</h2>
<p>For our TEP network we can either use a DHCP server or the IP pool variant preferred by VMware. I personally find the DHCP server more flexible and easier for environments that need to grow quickly, but there is no right or wrong at this point. However, when creating the pool, make sure that the subnet is large enough, as it can only be changed to a limited extent once addresses have been allocated.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Best practice</b>
        </div>
        <div class="admonition-content">VMware recommends the use of /24 networks in the current NSX Reference Design Guide. This would also be my recommendation, as you need 2 IP addresses per transport node for a dual TEP setup and you want to be prepared for the future. Outside of VCF, setups with 4 TEP IP addresses per transport node are also possible and you never know what will happen in the future.</div>
    </aside>
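<p>To get a feeling for the sizing, here is a minimal sketch that estimates how many transport nodes fit into a TEP subnet. The function name, the example CIDRs and the single reserved gateway address are assumptions for illustration and do not come from NSX.</p>

```python
import ipaddress


def max_transport_nodes(cidr, teps_per_node=2, reserved=1):
    """Return how many transport nodes a TEP subnet can serve.

    reserved counts addresses kept out of the pool, here one for the
    gateway (an assumption made for this sketch).
    """
    network = ipaddress.ip_network(cidr)
    # Subtract the network and broadcast addresses plus the reserved ones.
    usable = network.num_addresses - 2 - reserved
    return max(usable, 0) // teps_per_node


# Example subnets only; replace them with your own TEP network.
for prefix in ("10.28.4.0/24", "10.28.4.0/26"):
    dual = max_transport_nodes(prefix)
    quad = max_transport_nodes(prefix, teps_per_node=4)
    print(prefix, dual, quad)
```

<p>With these assumptions a /24 serves 126 transport nodes in a dual TEP setup and 63 nodes with 4 TEPs each, while a /26 drops to 30 and 15.</p>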
<p>To create an IP pool, we go to <strong>IP Address Pools</strong> under <strong>Networking</strong> in the NSX GUI and click the <strong>ADD IP ADDRESS POOL</strong> button.</p>
<p>Under Subnet we click on the IP RANGES button and configure our IP range, our network CIDR and a gateway. The TEP network must be routed, as the Edge Transport Nodes normally have their TEP IP addresses in a different network and VLAN. While it is possible for the Edge Transport Nodes to be on the same VLAN and network as our Host Transport Nodes, this is a less common scenario and requires the Edge Transport Nodes to either run on hosts that are not NSX-enabled or use NSX-backed VLAN segments. I have already written a <a href="https://evoila.com/blog/nsx-sharing-overlay-transport-vlan-between-esxi-teps-edge-teps/">blog post</a> for Evoila about this.</p>
<figure><a href="01.png"><picture><source srcset="/vcf-import-overlay/01_hu_f02e8477ba7ed9c6.png" type="image/png">
          <img
            src="/vcf-import-overlay/01_hu_f02e8477ba7ed9c6.png"alt="IP Pool"width="2302"
            height="1434"/>
        </picture></a><figcaption><p>IP Pool configuration (click to enlarge)</p></figcaption></figure>
<p>Finally, we define a name for the IP pool and save the configuration.</p>
<h2 id="update-the-default-transport-zone">Update the default Transport Zone</h2>
<p>In NSX, a transport zone (TZ for short) is a logical grouping of transport nodes that defines which network segments can exist on these nodes.
It determines the scope of the overlay and VLAN networks and is not a security feature.
There are two types of TZ: an overlay and a VLAN transport zone. For the imported or converted VCF environment, we only need to make changes to the overlay TZ. The VLAN TZ is created when the Edge Nodes are deployed via the SDDC Manager and no longer needs to be customized.
VCF uses the default transport zone <em><strong>nsx-overlay-transportzone</strong></em>.
This must also be assigned later in the <em><strong>transport node profile</strong></em>.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info Overlay TZ</b>
        </div>
        <div class="admonition-content">VCF 5.2.1 currently only supports one overlay transport zone. It is not possible to use multiple overlay transport zones. This can be a crucial point in the NSX design. If I don&rsquo;t want the overlay networks in my workload domains to be visible everywhere, I have to deploy multiple workload domains with their own NSX instance.</div>
    </aside>
<p>We first need to add two tags to the TZ. These tags are used internally by the SDDC Manager to identify the TZ as being used by VCF.</p>

    <aside class="admonition alert">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>TAGs</b>
        </div>
        <div class="admonition-content">Note that the tag consists of two parts, the tag name and the tag scope. Also, tags are case sensitive, so you need to enter them exactly as shown.</div>
    </aside>
<p>The transport zones can be found in the NSX GUI under <em><strong>System</strong></em> - <em><strong>Transport Zone</strong></em>. There should be exactly one overlay transport zone that also has the default flag.
The following TAGs must be attached to the overlay transport zone:</p>
<table>
  <thead>
      <tr>
          <th>Tag Name</th>
          <th>Scope</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>VCF</td>
          <td>Created by</td>
      </tr>
      <tr>
          <td>vcf</td>
          <td>vcf-orchestration</td>
      </tr>
  </tbody>
</table>
<p>The transport zone type cannot be changed afterwards. We can and must only add tags here.</p>
<figure><a href="02.png"><picture><source srcset="/vcf-import-overlay/02_hu_b57b5be6cd8eca87.png" type="image/png">
          <img
            src="/vcf-import-overlay/02_hu_b57b5be6cd8eca87.png"alt="Transport Zone"width="2870"
            height="980"/>
        </picture></a><figcaption><p>Transport Zone configuration (click to enlarge)</p></figcaption></figure>
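<p>Because the tags are case sensitive, a quick comparison helps to catch typos. The following is a minimal sketch that checks a list of tag and scope pairs against the two required tags from the table above. The dictionary layout and the function name are assumptions for illustration; the tag values themselves come from the table.</p>

```python
# The two tags from the table above, as tag/scope pairs.
REQUIRED_TAGS = [
    {"tag": "VCF", "scope": "Created by"},
    {"tag": "vcf", "scope": "vcf-orchestration"},
]


def missing_tags(existing_tags):
    """Return the required tags that are absent, compared case-sensitively."""
    present = {(t.get("tag"), t.get("scope")) for t in existing_tags}
    return [t for t in REQUIRED_TAGS if (t["tag"], t["scope"]) not in present]


# Example: the scope of the first tag was typed in lower case.
current = [
    {"tag": "VCF", "scope": "created by"},
    {"tag": "vcf", "scope": "vcf-orchestration"},
]
print(missing_tags(current))
```

<p>In this example the first tag is reported as missing, because <code>created by</code> is not the same as <code>Created by</code>.</p>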
<h2 id="edit-the-transport-node-profiles">Edit the Transport Node Profiles</h2>
<p>The next step is to customize the transport node profile so that we can use overlay networks and create NSX TEP interfaces on our ESXi servers.</p>
<p>Since NSX 3.1.x, the Transport Node profile can be found under <em><strong>Fabric</strong></em> - <em><strong>Hosts</strong></em> - <em><strong>Transport Node Profile</strong></em>. The VCF Import Tool has already created a profile for us. As is so often the case with VCF, the profile has a cryptic name that consists of an ID + domain and the text autoconf-tnp. You can safely ignore the name. We need to go to the profile&rsquo;s host switch config and add the overlay transport zone to the profile. As soon as we have added this TZ, additional options for IP assignment will appear. The previously created IP pool is selected here. We save the whole thing and that&rsquo;s it for the changes in NSX.</p>
<figure><a href="03.png"><picture><source srcset="/vcf-import-overlay/03_hu_c60198b4f4daa890.png" type="image/png">
          <img
            src="/vcf-import-overlay/03_hu_c60198b4f4daa890.png"alt="TNP"width="2308"
            height="1708"/>
        </picture></a><figcaption><p>Transport Node Profile configuration (click to enlarge)</p></figcaption></figure>
<p>After everything has been saved, NSX starts customizing the cluster and configuring TEP IP addresses for each of our ESXi servers.</p>
<h2 id="verify-tep-network">Verify TEP Network</h2>
<p>There are several ways to check the TEP network.
We can look it up in vCenter, where each host should now have 2 VMkernel adapters with IP addresses from our TEP network.
vmk10 and vmk11 are responsible for processing our TEP traffic in the dual TEP setup. The second option is in NSX itself.
To do this, go to <em><strong>System</strong></em> - <em><strong>Fabric</strong></em> - <em><strong>Hosts</strong></em> and look at our cluster.
In the TEP IP Addresses column, each ESXi in the cluster should have two IP addresses.</p>
<figure><a href="04.png"><picture><source srcset="/vcf-import-overlay/04_hu_1f3317844a57c088.png" type="image/png">
          <img
            src="/vcf-import-overlay/04_hu_1f3317844a57c088.png"alt="TEP IP"width="2888"
            height="1486"/>
        </picture></a><figcaption><p>Transport Node TEP IPs (click to enlarge)</p></figcaption></figure>
<p>The third option is to ping from one ESXi server to another. To do this, the <code>vmkping</code> command must be executed as follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="o">[</span>root@vcf03-m01-esx01:~<span class="o">]</span> vmkping -I vmk10 -S vxlan 10.28.4.105 -d -s <span class="m">1972</span>
</span></span><span class="line"><span class="cl">PING 10.28.4.105 <span class="o">(</span>10.28.4.105<span class="o">)</span>: <span class="m">1972</span> data bytes
</span></span><span class="line"><span class="cl"><span class="m">1980</span> bytes from 10.28.4.105: <span class="nv">icmp_seq</span><span class="o">=</span><span class="m">0</span> <span class="nv">ttl</span><span class="o">=</span><span class="m">64</span> <span class="nv">time</span><span class="o">=</span>0.274 ms
</span></span><span class="line"><span class="cl"><span class="m">1980</span> bytes from 10.28.4.105: <span class="nv">icmp_seq</span><span class="o">=</span><span class="m">1</span> <span class="nv">ttl</span><span class="o">=</span><span class="m">64</span> <span class="nv">time</span><span class="o">=</span>0.113 ms
</span></span><span class="line"><span class="cl"><span class="m">1980</span> bytes from 10.28.4.105: <span class="nv">icmp_seq</span><span class="o">=</span><span class="m">2</span> <span class="nv">ttl</span><span class="o">=</span><span class="m">64</span> <span class="nv">time</span><span class="o">=</span>0.267 ms
</span></span></code></pre></div><p>I use an MTU of 2000 in my lab, so the maximum ICMP payload is 1972 bytes (2000 bytes minus 20 bytes for the IPv4 header and 8 bytes for the ICMP header).<br>
<code>-d</code> sets the <strong>don&rsquo;t fragment</strong> bit, and <code>-S vxlan</code> selects the <strong>vxlan TCP/IP stack (netstack)</strong>, in which the TEP VMkernel interfaces live.
This tests the transport path used by the <strong>NSX Geneve overlay</strong>.</p>
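<p>The payload size depends on the MTU, so it has to be recalculated for every environment. Here is a minimal sketch of the arithmetic that also assembles the matching command line. The helper names are made up for this example, and the header sizes assume IPv4 without IP options.</p>

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_ping_payload(mtu):
    """Largest ICMP payload that fits into one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def vmkping_command(interface, target, mtu):
    """Build the vmkping call used above for a given MTU."""
    size = max_ping_payload(mtu)
    return f"vmkping -I {interface} -S vxlan {target} -d -s {size}"


print(vmkping_command("vmk10", "10.28.4.105", 2000))
print(max_ping_payload(1700), max_ping_payload(9000))
```

<p>For an MTU of 2000 this gives the 1972 bytes used above; for 1700 and 9000 the payload would be 1672 and 8972 bytes.</p>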

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info MTU</b>
        </div>
        <div class="admonition-content">NSX 4.X uses a minimum MTU of 1700 bytes. Normally, I also stick to this size in my labs, but I made a typing error when creating the AVN and I didn&rsquo;t want to do it all over again. Happy little accidents ;) In production environments, an MTU size of 9000 is always preferable.</div>
    </aside>
<h2 id="update-the-sddc-manager">Update the SDDC Manager</h2>
<p>We need to use the VCF Import Tool again so that our SDDC Manager also knows that we now have an overlay transport zone and that we can deploy our AVN for the Aria Suite, for example. To do this, we perform a sync and wait for it to complete.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">python3 vcf_brownfield.py sync --vcenter <span class="s1">&#39;vcf03-vcsa.lab.home&#39;</span> --sso-user <span class="s1">&#39;administrator@vsphere.local&#39;</span> --domain-name <span class="s1">&#39;mgmt&#39;</span>
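</span></span></code></pre></div><p>Don&rsquo;t be surprised if you get a validation error after the sync. You can ignore it, and it is entirely logical that it occurs: the check expects a vCenter Server without an NSX Manager, but ours has had one since the convert.</p>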
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">Status: VALIDATION_FAILED
</span></span><span class="line"><span class="cl">  Check Name: vCenter Server no NSX Manager present
</span></span><span class="line"><span class="cl">  Description: Check that the vCenter Server does not have an NSX Manager connected to it
</span></span><span class="line"><span class="cl">  Details: Detected an NSX Manager connected to the vCenter Server
</span></span><span class="line"><span class="cl">  Remediation: Please ensure that no NSX Manager is connected to the vCenter Server to be imported
</span></span></code></pre></div><p>Our converted management domain is now ready for use. Next, I deployed my Edge Cluster and then the Aria Suite. But that&rsquo;s not what this article is about.</p>
<h2 id="conclusion">Conclusion</h2>
<p>I really like the VCF Import tool, even though it still needs some fine-tuning here and there.
The sync function is quite practical and is a good way to bring changes made outside of the SDDC Manager back into the SDDC Manager.
Hopefully Broadcom will also provide the sync for a regularly deployed VCF, which from NSX&rsquo;s point of view would significantly enrich the VCF in terms of flexibility.</p>
]]></content>
		</item>
		
		<item>
			<title>VCF Import Tool - Run VCF with NFS as principal Storage </title>
			<link>https://sdn-warrior.org/posts/vcf-import-cluster/</link>
			<pubDate>Sun, 23 Feb 2025 16:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vcf-import-cluster/</guid>
			<description><![CDATA[Importing from a vSphere cluster with NFS 3 as principal storage without losing support.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>VCF is a powerful platform designed to simplify the deployment of vSphere, NSX and the Aria product family.
This is both a blessing and a curse. On the one hand, the Cloudbuilder and SDDC Manager significantly simplify deployment, but this also takes away a certain amount of flexibility.
In addition, not every customer starts on a greenfield site.
Another point is the need for resources, which affects us home labbers in particular.
Many want to prepare for the new VCP exams and are now forced to deploy VCF in their homelab.
There are already many solutions for this, such as the Holodeck, the <a href="https://github.com/lamw/vcf-automated-lab-deployment">Automated Lab Deployment Script</a> or the <a href="https://github.com/lamw/vcf-automated-import-lab-deployment">Automated VMware Cloud Foundation Import Lab Deployment</a> script (both from the great <a href="https://williamlam.com/">William Lam</a>), but in my view these are too far removed from the kind of deployment that a customer will experience.
Therefore, I will show several ways to build VCF as close as possible to a real deployment without needing the resources of one.
In this article I would like to show you how to do a convert with the VCF Import Tool. I will write more articles about VCF deployment in the future.</p>
<h2 id="why-use-the-import-tool">Why use the import tool?</h2>
<p>As indicated in the title, the import tool allows a VCF deployment with NFS as principal storage, which means that we do not need vSAN.
As much as I like vSAN, vSAN in a homelab always means additional resource consumption.
I need fast networking and local NVMe drives, and the overall performance is still not great, as I usually set up my labs nested in my homelab.
In addition, vSAN wears out consumer NVMe drives very quickly.
Thanks to VCF 5.2 and the VCF Import Tool, we can get around this relatively easily. Of course, there are always a few limitations.</p>
<table>
  <thead>
      <tr>
          <th style="text-align: center">Storage Type</th>
          <th style="text-align: center">Consolidated Workload Domain</th>
          <th style="text-align: center">Management Domain</th>
          <th style="text-align: center">VI Workload Domain</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: center"><strong>Principal</strong> Storage</td>
          <td style="text-align: center">No</td>
          <td style="text-align: center">Only for a management domain converted from vSphere infrastructure</td>
          <td style="text-align: center">Yes</td>
      </tr>
      <tr>
          <td style="text-align: center"><strong>Supplemental</strong> Storage</td>
          <td style="text-align: center">Yes</td>
          <td style="text-align: center">Yes</td>
          <td style="text-align: center">Yes</td>
      </tr>
  </tbody>
</table>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>NFS Version</b>
        </div>
        <div class="admonition-content">VCF only supports NFS protocol version 3 when NFS is used as principal storage. Supplemental storage can use either vSphere supported NFS protocol version 3 or 4.1. Although NFS 3 and NFS 4.1 can coexist on the same host, you cannot use different NFS versions to mount the same datastore on different hosts.</div>
    </aside>
<p>What can we take from the table now? It is possible to deploy a standard VCF design with NFS3 and without vSAN, and the best part is that the whole thing is officially supported and is not a homemade solution.</p>
<h2 id="deployment-and-preparation-of-the-management-domain">Deployment and preparation of the management domain</h2>
<p>First of all, we need to prepare our future management domain. To do this, I am deploying four virtual ESXi servers. I am using the Flings OVA from William Lam and adapting it for my lab.
My 4 nested ESXi servers have the following hardware:</p>
<ul>
<li>10 vCPUs</li>
<li>45 GB RAM</li>
<li>16 GB Thin Storage</li>
<li>2 Network Cards</li>
</ul>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Virtual Hardware</b>
        </div>
        <div class="admonition-content">It is possible to carry out the deployment with fewer resources, but I had to deal with timeouts during the deployment and, based on my experience, this is the best sizing for my lab hardware.</div>
    </aside>
<p>NTP service and DNS including reverse lookup must be available and working. In addition, all VLANs, gateways and, if necessary, firewall rules must be in place. To keep this article short, I won&rsquo;t go into this part in detail. The required networks can be found in the VCF Guide.</p>
<p>Of course, there are a few restrictions and things that we have to consider here as well.</p>
<ul>
<li>
<p>Cluster - Storage:</p>
<p>The storage of the default cluster must be one of vSAN, NFS v3, VMFS-FC, or VMFS-FCoE (only supported with VCF 5.2.1 and later).
NFS 4.1, VVOLs, and native iSCSI are not supported.</p>
<p>Clusters cannot be stretched vSAN (and we will not be using vSAN either)</p>
<p>VCF 5.2: All clusters (vSAN, NFS v3, FC) must be 4 nodes minimum.</p>
<p>VCF 5.2.1.x: When using NFS or FC and vLCM images, the default cluster must be 2 nodes minimum. When using NFS or FC and vLCM baselines, the default cluster must be 4 nodes minimum.</p>
</li>
<li>
<p>Cluster - Network:</p>
<p>vCenter Server must not have an existing NSX instance registered</p>
<p>LACP: supported only with VCF Import Tool 5.2.1.2 and SDDC Manager 5.2.1.1. Other versions are not supported.</p>
<p>Use vSphere Distributed Switches only. Standard or Cisco virtual switches are not supported.</p>
<p>VMkernel IP addresses must be statically assigned</p>
<p>Multiple VMkernels for a single traffic type (vSAN, vMotion) are not supported</p>
<p>VCF Import Tool 5.2.1.2 with SDDC Manager 5.2.1.1: ESXi hosts can have a different number of physical uplinks (minimum 2) assigned to a vSphere distributed switch. Each uplink must be a minimum of 10Gb.</p>
<p>Earlier versions: ESXi hosts must have the same number of physical uplinks (minimum 2) assigned to a vSphere distributed switch. Each uplink must be a minimum of 10Gb.</p>
<p>vSphere distributed switch teaming policies must match VMware Cloud Foundation standards</p>
<p>Dedicated vMotion network must be configured</p>
</li>
<li>
<p>Cluster - Compute:</p>
<p>Clusters must not be VxRail managed</p>
<p>Clusters that use vSphere Configuration Profiles are not supported</p>
<p>All clusters must be running vSphere 8.0U3 or later</p>
<p>DRS must be fully automated</p>
</li>
</ul>
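<p>The node count rules from the storage section are easy to mix up, so here is a minimal sketch that encodes them. The function name and the parameter values are assumptions for illustration. Only the combinations named in the list are covered explicitly; every other combination falls back to four hosts, which is a conservative assumption and not a statement from the documentation.</p>

```python
def min_default_cluster_nodes(vcf_version, storage, vlcm_mode):
    """Minimum host count for the default cluster, per the list above.

    vcf_version is a tuple such as (5, 2) or (5, 2, 1).
    storage is "vsan", "nfs" or "fc"; vlcm_mode is "images" or "baselines".
    """
    newer_release = vcf_version >= (5, 2, 1)
    if newer_release and storage in ("nfs", "fc") and vlcm_mode == "images":
        return 2
    # VCF 5.2 and vLCM baselines on 5.2.1.x need four hosts. Treating all
    # remaining combinations as four hosts is an assumption of this sketch.
    return 4


print(min_default_cluster_nodes((5, 2, 1), "nfs", "images"))
print(min_default_cluster_nodes((5, 2, 1), "nfs", "baselines"))
print(min_default_cluster_nodes((5, 2), "nfs", "images"))
```

<p>Only the first call returns 2. In other words, a two-node NFS cluster is only an option on 5.2.1.x in combination with vLCM images.</p>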

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Limitations</b>
        </div>
        <div class="admonition-content">There are more limitations, but these should be the most important ones.
A 10 Gb/s network is not necessary in a nested environment. The exact requirements can be found in the current <a href="https://techdocs.broadcom.com/us/en/vmware-cis/vcf/vcf-5-2-and-earlier/5-2/map-for-administering-vcf-5-2/importing-existing-vsphere-environments-admin/considerations-before-converting-or-importing-existing-vsphere-environments-into-vcf-admin.html">VCF 5.2 Administration Guide</a>.</div>
    </aside>
<p>So after I have prepared my future management domain, it should look similar to the screenshot below.</p>
<figure><a href="vcf01.png"><picture><source srcset="/vcf-import/vcf01_hu_7174c1164184cb4c.png" type="image/png">
          <img
            src="/vcf-import/vcf01_hu_7174c1164184cb4c.png"alt="ESXi Cluster"width="1257"
            height="550"/>
        </picture></a><figcaption><p>ESXi Cluster with NFS Storage (click to enlarge)</p></figcaption></figure>
<h2 id="convert-the-cluster-to-a-management-domain">Convert the cluster to a management domain</h2>
<p>After I have prepared my cluster, the real fun can begin. To do this, I first need to download the SDDC Manager, the VCF Import Tool and the appropriate NSX bundle.
In my case, these were the following software versions:</p>
<ul>
<li>VCF-SDDC-Manager-Appliance-5.2.1.1-24397777.ova</li>
<li>VMware Software Install Bundle - NSX_T_MANAGER 4.2.1.0</li>
<li>VCF Import Tool 5.2.1.2</li>
</ul>
<p>The software can be found in the Broadcom support portal under Tools and Drivers in the corresponding VCF 5.2.1 category.
First, I deploy the SDDC Manager appliance to my cluster. The SDDC Manager will later deploy our NSX managers. The SDDC Manager will deploy NSX to the same network and storage where it is located. Network and storage cannot be customized during the deployment of the NSX managers.
After that, I create a JSON file for the future NSX cluster. Here is my example file:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;license_key&#34;</span><span class="p">:</span> <span class="s2">&#34;XXXX-XXXX-XXXX-XXXX-XXXX&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;form_factor&#34;</span><span class="p">:</span> <span class="s2">&#34;medium&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;admin_password&#34;</span><span class="p">:</span> <span class="s2">&#34;xxx&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;install_bundle_path&#34;</span><span class="p">:</span> <span class="s2">&#34;/nfs/vmware/vcf/nfs-mount/bundle/bundle-133764.zip&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;cluster_ip&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.0.10&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;cluster_fqdn&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf03-m01-nsx01.lab.home&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;manager_specs&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;fqdn&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf03-m01-nsx01a.lab.home&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;name&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf03-m01-nsx01a&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;ip_address&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.0.11&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;gateway&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.0.1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;subnet_mask&#34;</span><span class="p">:</span> <span class="s2">&#34;255.255.255.0&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;fqdn&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf03-m01-nsx01b.lab.home&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;name&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf03-m01-nsx01b&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;ip_address&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.0.14&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;gateway&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.0.1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;subnet_mask&#34;</span><span class="p">:</span> <span class="s2">&#34;255.255.255.0&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;fqdn&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf03-m01-nsx01c.lab.home&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;name&#34;</span><span class="p">:</span> <span class="s2">&#34;vcf03-m01-nsx01c&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;ip_address&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.0.15&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;gateway&#34;</span><span class="p">:</span> <span class="s2">&#34;10.28.0.1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;subnet_mask&#34;</span><span class="p">:</span> <span class="s2">&#34;255.255.255.0&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">  <span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div>
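<p>A typo in this file only shows up late in the convert, so a quick sanity check beforehand can save time. The following is a minimal sketch that uses only the keys from the example file above. The checks themselves (number of managers, unique addresses, matching subnet, FQDN prefix) are my own assumptions and not the validation the import tool performs.</p>

```python
import ipaddress
import json


def check_nsx_spec(spec, expected_managers=3):
    """Return a list of problems found in an NSX deployment spec dict."""
    problems = []
    managers = spec.get("manager_specs", [])
    if len(managers) != expected_managers:
        problems.append(f"expected {expected_managers} managers, found {len(managers)}")
    addresses = [spec.get("cluster_ip")] + [m.get("ip_address") for m in managers]
    if len(set(addresses)) != len(addresses):
        problems.append("cluster_ip and ip_address values must be unique")
    for m in managers:
        # Derive the subnet from gateway and mask, then test the manager IP.
        subnet = ipaddress.ip_network(f"{m['gateway']}/{m['subnet_mask']}", strict=False)
        if ipaddress.ip_address(m["ip_address"]) not in subnet:
            problems.append(f"{m['name']}: ip_address is outside the gateway subnet")
        if not m["fqdn"].startswith(m["name"] + "."):
            problems.append(f"{m['name']}: fqdn does not start with the VM name")
    return problems


# Shortened example with two deliberate mistakes: only two managers,
# and the second one sits in the wrong subnet.
example = json.loads("""
{
  "cluster_ip": "10.28.0.10",
  "manager_specs": [
    {"fqdn": "vcf03-m01-nsx01a.lab.home", "name": "vcf03-m01-nsx01a",
     "ip_address": "10.28.0.11", "gateway": "10.28.0.1",
     "subnet_mask": "255.255.255.0"},
    {"fqdn": "vcf03-m01-nsx01b.lab.home", "name": "vcf03-m01-nsx01b",
     "ip_address": "10.28.1.14", "gateway": "10.28.0.1",
     "subnet_mask": "255.255.255.0"}
  ]
}
""")
for problem in check_nsx_spec(example):
    print(problem)
```

<p>For a real file, load it with <code>json.load</code> instead of the inline string. For a single node lab deployment, pass <code>expected_managers=1</code>.</p>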
    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>NSX Deployment</b>
        </div>
        <div class="admonition-content"><p>The import tool creates a 3-node NSX cluster by default.
By customizing the domainmanager properties, you can reduce this to a single node. However, this is then no longer supported by Broadcom and should not be used in a production environment.
It is not enough to just customize the JSON file.</p>
<p>In addition, I tried it first with the small form factor and my convert was not successful because not all services could be started successfully on the NSX cluster. I have experienced this several times in nested environments. Therefore, I always deploy my NSX managers with the medium form factor and, after a successful deployment, manually remove the RAM reservation and reduce each manager to 4 vCPUs and 20 GB RAM. In my environment, this is the best compromise between resource consumption, functionality and performance.</p>
</div>
    </aside>
<h3 id="tipp-customizing-the-nsx-cluster-size">Tip: Customizing the NSX cluster size</h3>
<p>First, I log in to my SDDC Manager as the vcf user via SSH and switch to the root context. Then I append the
following line to the application-prod.properties file.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">cat &gt;&gt; /etc/vmware/vcf/domainmanager/application-prod.properties &lt;&lt; EOF
</span></span><span class="line"><span class="cl">nsxt.manager.cluster.size=1
</span></span><span class="line"><span class="cl">EOF
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">systemctl restart domainmanager.service
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">watch &#39;systemctl status domainmanager.service&#39;
</span></span></code></pre></div><p>After the domainmanager service has restarted, the change should take effect and a single-node NSX cluster can be deployed. I did not do this in my current test, but I have often deployed VCF with a single-node NSX cluster in the lab.</p>
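<p>Because <code>cat &gt;&gt;</code> appends unconditionally, running the snippet twice leaves the key in the file twice. The following is a minimal Python sketch of an idempotent alternative. The helper name <code>set_property</code> and the sample line <code>some.other.key=value</code> are made up for illustration and are not part of any VCF tooling. The sketch only transforms the text of a properties file and assumes simple <code>key=value</code> lines without spaces around the equals sign; reading and writing the real file on the SDDC Manager is left out.</p>

```python
def set_property(text, key, value):
    """Return the properties text with key set to value exactly once."""
    prefix = key + "="
    # Drop every existing assignment of the key, keep all other lines.
    kept = [line for line in text.splitlines()
            if not line.strip().startswith(prefix)]
    kept.append(prefix + value)
    return "\n".join(kept) + "\n"


# Hypothetical file content, for illustration only.
original = "some.other.key=value\nnsxt.manager.cluster.size=3\n"
updated = set_property(original, "nsxt.manager.cluster.size", "1")
print(updated)
```

<p>Applying the function to its own output returns the same text, so repeated runs do not pile up duplicate entries.</p>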
<h3 id="upload-software-and-run-precheck">Upload Software and run precheck</h3>
<p>The next step is relatively simple: I use WinSCP to upload the import tool, the NSX upgrade bundle and the NSX deployment JSON to my SDDC Manager.
The upgrade bundle is placed in the directory specified in the NSX spec. In my case, it is</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">/nfs/vmware/vcf/nfs-mount/bundle/bundle-133764.zip
</span></span></code></pre></div><p>and I upload the NSX spec and the import tool to</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">/home/vcf
</span></span></code></pre></div><p>Next, extract the import tool and you&rsquo;re ready to go.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">tar xvf vcf-brownfield-import-5.2.1.2-24494579.tar.gz
</span></span></code></pre></div><p>To run the precheck, we change to the vcf-brownfield-toolset directory and run the following command.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">python3 vcf_brownfield.py check --vcenter &#39;&lt;my-vcenter-address&gt;&#39; --sso-user &#39;&lt;my-sso-username&gt;&#39;
</span></span></code></pre></div><p>If the precheck finds any errors, the tool generates guardrails files that describe the problem. In my case, the precheck ran cleanly, so I can now start the actual convert of my management domain.</p>
<p>The command is more or less the same; only a few parameters have to be changed.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">python3 vcf_brownfield.py convert --vcenter &#39;&lt;vcenter-fqdn&gt;&#39; --sso-user &#39;&lt;sso-user&gt;&#39; --domain-name &#39;&lt;wld-domain-name&gt;&#39; --nsx-deployment-spec-path &#39;&lt;nsx-deployment-json-spec-path&gt;&#39;
</span></span></code></pre></div><p>Actions:</p>
<table>
  <thead>
      <tr>
          <th>Action</th>
          <th>Additional Information</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>-h, --help</code></td>
          <td>Shows the VCF Import Tool help.</td>
      </tr>
      <tr>
          <td><code>-v, --version</code></td>
          <td>Displays the VCF Import Tool version.</td>
      </tr>
      <tr>
          <td><code>convert</code></td>
          <td>Converts existing vSphere infrastructure into the management domain in SDDC Manager.</td>
      </tr>
      <tr>
          <td><code>check</code></td>
          <td>Checks whether a vCenter is suitable to be imported into SDDC Manager as a workload domain.</td>
      </tr>
      <tr>
          <td><code>import</code></td>
          <td>Imports a vCenter as a VI workload domain into SDDC Manager.</td>
      </tr>
      <tr>
          <td><code>sync</code></td>
          <td>Syncs an imported VI workload domain or a VI workload domain deployed from SDDC Manager. See <em>Manage Workload Domain Configuration Drift Between vCenter Server and SDDC Manager</em>.</td>
      </tr>
      <tr>
          <td><code>deploy-nsx</code></td>
          <td>Deploys NSX Manager as a standalone operation. See <em>Deploy NSX Manager for Workload Domains</em>.</td>
      </tr>
      <tr>
          <td><code>precheck</code></td>
          <td>Runs prechecks on vCenter.</td>
      </tr>
  </tbody>
</table>
<p>Parameters:</p>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Additional Information</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>--vcenter</code></td>
          <td>Target vCenter Server for the current operation.</td>
      </tr>
      <tr>
          <td><code>--sso-user</code></td>
          <td>SSO administrator user for the target vCenter Server.</td>
      </tr>
      <tr>
          <td><code>--sso-password</code></td>
          <td>SSO administrator password for the target vCenter Server. Used for prevalidation only.</td>
      </tr>
      <tr>
          <td><code>--domain-name</code></td>
          <td>Workload domain name to be assigned to the target environment during convert/import.</td>
      </tr>
      <tr>
          <td><code>--nsx-deployment-spec-path</code></td>
          <td>Absolute path to the NSX deployment spec JSON file.</td>
      </tr>
  </tbody>
</table>
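<p>The actions and parameters from the two tables can be combined programmatically, for example when you script a lab rebuild. This is only a sketch in Python: the helper <code>build_command</code>, the host name <code>vcenter.lab.example</code>, the domain name <code>mgmt</code> and the file name <code>nsx-spec.json</code> are placeholders I made up. Only the script name, the actions and the flags come from the commands shown earlier.</p>

```python
import shlex


def build_command(action, vcenter, sso_user, domain_name=None, nsx_spec=None):
    """Assemble the argument list for vcf_brownfield.py."""
    cmd = ["python3", "vcf_brownfield.py", action,
           "--vcenter", vcenter, "--sso-user", sso_user]
    # The optional flags are only added when a value is given.
    if domain_name is not None:
        cmd += ["--domain-name", domain_name]
    if nsx_spec is not None:
        cmd += ["--nsx-deployment-spec-path", nsx_spec]
    return cmd


# Placeholder values, not taken from a real environment.
check = build_command("check", "vcenter.lab.example",
                      "administrator@vsphere.local")
convert = build_command("convert", "vcenter.lab.example",
                        "administrator@vsphere.local",
                        domain_name="mgmt",
                        nsx_spec="/home/vcf/nsx-spec.json")
print(shlex.join(check))
print(shlex.join(convert))
```

<p>Keeping the arguments as a list and only joining them for display avoids quoting problems if a value ever contains spaces.</p>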

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">The tool took about 80 minutes to complete the process on my system. Of course, this always depends on the performance of the environment.
You can view the progress at any time in the SDDC Manager or in vCenter. Along the way, the tool asks for confirmation, for example whether you have taken a snapshot of the SDDC Manager, so it cannot easily run completely unattended. Grab a coffee and enjoy the show.</div>
    </aside>
<p>After the convert has hopefully completed successfully, the result should look like my screenshots.</p>
<figure><a href="vcf02.png"><picture><source srcset="/vcf-import/vcf02_hu_cc74f50353bd763d.png" type="image/png">
          <img
            src="/vcf-import/vcf02_hu_cc74f50353bd763d.png"alt="Convert successful"width="2188"
            height="773"/>
        </picture></a><figcaption><p>Convert successful (click to enlarge)</p></figcaption></figure>
<figure><a href="vcf03.png"><picture><source srcset="/vcf-import/vcf03_hu_e4ee367a6da757fb.png" type="image/png">
          <img
            src="/vcf-import/vcf03_hu_e4ee367a6da757fb.png"alt="MGMT Domain with NFS Storage"width="1489"
            height="534"/>
        </picture></a><figcaption><p>MGMT Domain with NFS Storage (click to enlarge)</p></figcaption></figure>
<p>Congratulations, we now have a supported workload domain with NFS storage.
Of course, we still have some fine-tuning to do. For example, no NSX Edges have been deployed yet, and the ESXi hosts have been prepared as transport nodes but do not yet have TEP addresses.
This is work that we have to do manually in the NSX Manager.
We also still need to install licenses in the SDDC Manager and set up the software repository so that we can install updates directly via the SDDC Manager if necessary.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>NSX Edge Cluster</b>
        </div>
        <div class="admonition-content">During the convert, NSX is deployed without an overlay configuration. After manually adjusting the overlay configuration in the NSX Manager, you have to perform a sync via the import tool. Otherwise, you cannot deploy edge clusters via the SDDC Manager and cannot use AVNs, which in turn are needed for the Aria Suite.</div>
    </aside>
<p>There are now two options for the future workload domains: 1. import via the import tool (the process is analogous to the convert) or 2. normal deployment via the SDDC manager.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Importing or converting existing ESXi clusters enables us to deploy VCF in a resource-efficient way in the lab and also offers the possibility for PoCs for customers who would like to test VCF but do not have vSAN Ready nodes. The import tool in VCF 5.2 extends the possibilities of VCF and gives us a bit more flexibility in deploying VCF. From my point of view, this is the right way to make VCF more widely accepted by customers. It also gives us the opportunity to become familiar with the product with fewer resources.</p>

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>A last word</b>
        </div>
        <div class="admonition-content">I wouldn&rsquo;t recommend importing or converting an existing workload productively just yet. The feature is still quite new and, although many errors are already caught in the prechecks, it&rsquo;s not yet perfect and bulletproof. For customer PoCs, I would always set up a fresh cluster and then import it so that you don&rsquo;t end up with any legacy issues.</div>
    </aside>
]]></content>
		</item>
		
		<item>
			<title>JetKVM - all show and no substance?</title>
			<link>https://sdn-warrior.org/posts/jetkvm/</link>
			<pubDate>Mon, 03 Feb 2025 21:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/jetkvm/</guid>
			<description><![CDATA[The good, the bad and the ugly - JetKVM]]></description>
			<content type="html"><![CDATA[<h2 id="jetkvm-a-kickstarter-project">JetKVM a Kickstarter Project</h2>
<p>Yes, here we have another Kickstarter project. JetKVM advertises itself as a cheap, fast and open-source network KVM solution. It promises 60 FPS with an average latency of 30-60 ms, H.264 encoding and an RJ11 extension port. All for a modest $69 - it almost sounds too good to be true. Well, I got hooked by all the influencer videos and ordered two of them right away. Since one of the two arrived DOA (dead on arrival), I&rsquo;m only now getting to my test and conclusion.</p>
<figure><picture><source srcset="/jetkvm/01_hu_3bd23bd8c8625c79.jpeg" type="image/jpeg">
          <img
            src="/jetkvm/01_hu_3bd23bd8c8625c79.jpeg"alt="JetKVM"width="1200"
            height="900"/>
        </picture><figcaption><p>JetKVM</p></figcaption></figure>
<p>But first things first: what exactly is a KVM? In short, it stands for keyboard, video and mouse. The KVM allows me to control my computer remotely. This is very practical for my NUCs, as they do not have remote management and are permanently installed in my 19-inch rack. Since it is a network KVM, it can be controlled via the network, quite conveniently via the browser. In addition, JetKVM offers the option of a virtual “CD” drive, enabling remote installation of operating systems or data transfer via ISOs.</p>
<h2 id="how-to-get-started">How to get started</h2>
<p>The JetKVM package is relatively small. The slim box contains the JetKVM itself, a very, very short mini HDMI to HDMI cable, a USB A to C cable and a USB C power/data splitter cable. The JetKVM itself has only a small touchscreen display (which cannot be turned off), a mini HDMI port, a USB C port for data and power, a 100 Mbit RJ45 port and an RJ11 extension port for which separate modules will probably be available at a later date, for example to switch on a computer via ATX. That&rsquo;s it.</p>
<h3 id="power-on-the-jetkvm">Power on the JetKVM</h3>
<p>The most common way to power the JetKVM is through its USB-C port, which is connected directly to the computer you&rsquo;re controlling. It is also possible to use a USB-C Y-cable splitter that separates the power and data connections. This allows you to connect one cable to your remote host for data transfer, while powering the device from a separate 5V power supply, such as a phone charger. According to the manufacturer, it is also possible to supply the JetKVM with power via the RJ11 socket – however, I was unable to test this. In addition, an ATX extension is planned that would allow the JetKVM to be supplied with 5V directly via pin headers.</p>
<p>While it looks very good on the power side, it looks rather thin on the network side. You definitely need a network with DHCP. There is currently no way to set a fixed IP address. Perhaps one of the numerous customized firmware versions that can be found on the internet offers such a possibility, but it does not work with the unmodified original software. As is usual with network KVMs, we only have a 100 Mbit port, which I will come back to later.</p>
<p>After the JetKVM has booted up and obtained an IP address via DHCP, the interface greets us in bright white (no dark mode - put on sunglasses) and prompts us to assign a password. Congratulations – you have successfully set up your JetKVM.</p>
<h3 id="settings">Settings</h3>
<p>The settings are still relatively limited at the moment. You can check for updates, hide the mouse cursor (the local machine&rsquo;s cursor, not the one you are controlling), enable a mouse jiggler that is really handy and does not get in the way, and adjust a few quality settings. Adjusting the EDID is also quite practical: you can, for example, tell the host that it is connected to a Dell monitor, which can help with incompatibilities. You can also specify a custom EDID. I have not tested the JetKVM Cloud and therefore cannot say anything about it. In principle, the focus was on the essentials. The most annoying gap is that a relative mouse mode does not exist yet. In full-screen mode the mouse is in absolute mode, so the cursor position differs between the local and the remote computer, which makes control unnecessarily difficult.</p>
<figure><a href="02.png"><picture><source srcset="/jetkvm/02_hu_955a2260d3c26187.png" type="image/png">
          <img
            src="/jetkvm/02_hu_955a2260d3c26187.png"alt="JetKVM Settings"width="2526"
            height="1283"/>
        </picture></a><figcaption><p>JetKVM Settings (click to enlarge)</p></figcaption></figure>
<p>Apart from the settings listed, you can enable or disable the local password, enable Dev Channel updates or even unlock SSH to play around with the firmware and system yourself. There is also a small connection status page where you can see the round-trip time, jitter, packet loss and FPS. As expected, all of this looked great during my tests, even though my computer was on Wi-Fi. Then again, it&rsquo;s not that difficult to transfer 100 Mbit stably over Wi-Fi 6E.</p>
<h3 id="other-features">Other Features</h3>
<p>Now let&rsquo;s move on to the other features, and there are a few things to criticize. But let&rsquo;s start with the practical ones. There is a virtual keyboard, and it always worked well for me. It is especially handy if you need to get into a BIOS or you&rsquo;re a Mac user and can&rsquo;t just press CTRL+ALT+DEL (because you don&rsquo;t have DEL on your mechanical keyboard 😄). Since the JetKVM unfortunately can&rsquo;t turn on a PC remotely without the extension module, a Wake-on-LAN function was implemented. This worked well for me as long as the JetKVM was in the same VLAN as the PC. Since my MikroTik switches can do that too, this is rather uninteresting for me and more of a nice-to-have. But now we come to the biggest annoyance: the virtual CD drive. Currently, you can only mount ISOs that were previously uploaded to the JetKVM&rsquo;s internal storage or use URL Mount, which is an experimental feature. Streaming an ISO via the browser, as offered by iDRAC or other network KVM solutions, is not yet possible. Uploading to the internal storage (13 GB free) takes a correspondingly long time over 100 Mbit. With a vCenter ISO, the internal storage is then already quite full.</p>
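<p>To put the upload time into numbers, here is a small back-of-the-envelope calculation in Python. It assumes decimal units (1 GB = 1000 MB) and the full 100 Mbit/s line rate without any protocol overhead, so real uploads will take longer. The function name <code>transfer_seconds</code> is just an illustrative helper.</p>

```python
def transfer_seconds(size_gb, link_mbit_per_s):
    """Ideal transfer time for size_gb (decimal GB) over a link in Mbit/s."""
    size_megabit = size_gb * 1000 * 8  # GB to MB to Mbit
    return size_megabit / link_mbit_per_s


# Filling the 13 GB of free internal storage over the 100 Mbit port.
seconds = transfer_seconds(13, 100)
print(f"{seconds / 60:.1f} minutes at line rate")
```

<p>Even in this best case, filling the 13 GB of free storage takes more than a quarter of an hour.</p>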
<figure><picture><source srcset="/jetkvm/03_hu_bbc028c7d000effd.png" type="image/png">
          <img
            src="/jetkvm/03_hu_bbc028c7d000effd.png"alt="JetKVM Mount ISO"width="702"
            height="489"/>
        </picture><figcaption><p>JetKVM ISO mount</p></figcaption></figure>
<p>To make matters worse, the whole thing then froze my Mac (the client I control via JetKVM) multiple times. To be fair, I also tested it with my Windows computer and inserted an Ubuntu Live DVD, and everything worked fine there. You can read from the internal memory at approx. 30 MB/s, which makes installation quite bearable. I hope that this will be improved in the near future and that it will then also work with my Mac at some point. Mounting the drive didn&rsquo;t always work either, and I had to restart the JetKVM once because something got stuck and the ISO was in an undefined state. I&rsquo;ve seen other systems that are more reliable.</p>
<h2 id="the-good-the-bad-and-the-ugly">The good, the bad and the ugly</h2>
<h3 id="the-good">The good</h3>
<p>Let&rsquo;s start with the good. The interface is really tidy, very responsive and the latencies are fabulous. The 60 FPS are there and are maintained. The network connection is stable and the boot times are blazingly fast. You can even watch YouTube videos at 60 FPS without judder – crazy. Working is wonderful. If you have several systems that you can&rsquo;t access via RDP or other tools (for reasons 😉), it&rsquo;s great to work with. Writing emails, holding team calls, etc. Everything is easy. Also the price is fantastic. The feel and workmanship are top-notch. I didn&rsquo;t look at the inside. Plus, the software is 100% open source.</p>
<h3 id="the-bad">The bad</h3>
<p>One of my two JetKVMs was DOA, and judging by the manufacturer&rsquo;s Discord server I wasn&rsquo;t the only one, so I&rsquo;ll chalk that up to ‘the bad’. To its credit, JetKVM&rsquo;s support was great: the replacement was free and I could even have kept the defective unit - top-notch. However, I have read several reports of problems with a capacitor that can cause JetKVMs to die, and a few units from the first batches were probably affected. Maybe I was just unlucky, we&rsquo;ll see. The replacement delivery took about 4 weeks and came directly from China via a German selling agent. The lack of relative mouse mode is also quite a downer for me, as it makes working properly in full-screen mode (if you don&rsquo;t have a 16:9 format) quite annoying. In addition, the fact that there are no expansion modules yet to switch computers on and off via ATX is a bit annoying, especially in rack operation – ok, I knew that in advance. Hopefully something will come soon.</p>
<h3 id="the-ugly">The ugly</h3>
<p>Unfortunately, the biggest point of criticism currently has to do with the implementation of the virtual CD drive. Why it doesn&rsquo;t work properly on Mac is a mystery to me. I can live with the fact that URL Mount is still labelled as experimental. But I can&rsquo;t stream the ISO via the browser, which is a real downer. Uploading is painfully slow (no wonder at 100 Mbit) and space is also very limited. For larger installations, I would always prefer a USB stick or a small SSD. Installing an ESXi with it is not a problem, but it&rsquo;s not really fun. I hope this will be fixed soon.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Do I regret the purchase? Is JetKVM a Kickstarter cash grab? Hell no. The software is of course still in beta and matures at the customer - if we are honest, we are used to nothing else. The power consumption of 0.6 watts in operation is terrific. Overall, I&rsquo;m satisfied, and I hope that JetKVM or the community will deliver further software updates.</p>
<p>I&rsquo;ll definitely write another article if missing features have been added or the JetKVM has stopped working.
Time will tell whether the saying “all show and no substance” applies.</p>
]]></content>
		</item>
		
		<item>
			<title>Migration with HCX</title>
			<link>https://sdn-warrior.org/posts/migration-with-hcx/</link>
			<pubDate>Tue, 28 Jan 2025 20:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/migration-with-hcx/</guid>
			<description><![CDATA[Live Migration of Workloads with VMware HCX: A Customer Story]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>I was brought in to help with a customer project that involved a VCF setup and migrating the workload domains to a new VCF deployment. The challenge of this project was that we had to adopt the existing networks without making any changes while also reducing the downtime of the workloads to an absolute minimum. We used HCX and network extension to solve this problem.</p>
<h2 id="what-is-hcx">What is HCX?</h2>
<p>VMware HCX is an application mobility platform designed to simplify application migration, workload redeployment, and disaster recovery in data centers and clouds.</p>
<p>HCX covers several use cases:</p>
<ul>
<li>
<p>Extend Networks with HCX
Seamlessly extend vSphere and NSX network segments and retain the IP and MAC addresses of migrated VMs to accelerate consumption of modernized resources. Network Extension minimizes the need for complicated networking changes.</p>
</li>
<li>
<p>Migrating Virtual Machines
Select from multiple HCX mobility technologies for optimized migrations at scale for both VMware and non-VMware workloads.</p>
</li>
<li>
<p>Virtual Machine Disaster Recovery
Protect data center applications and workloads through asynchronous replication and recovery of virtual machines, as well as integration with VMware’s Site Recovery Manager suite of features and tools.</p>
</li>
</ul>
<p>In short, HCX is the answer to my customer&rsquo;s problem.</p>
<h2 id="first-things-first-what-are-the-requirements-for-hcx">First things first: what are the requirements for HCX?</h2>
<p>Since HCX has been part of VCF since version 5.1.1, many customers have the opportunity to benefit from HCX. For those who want to test VCF, here is the good news: an eval license comes with HCX.

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Eval License</b>
        </div>
        <div class="admonition-content">The eval license has a runtime of 60 days and can migrate a maximum of 20 VMs per migration type.</div>
    </aside></p>
<ul>
<li>
<p>For environments requiring NSX virtual networking, you must install and configure NSX, including integration with the vCenter Server, before deploying HCX Manager.</p>
</li>
<li>
<p>In the destination environment, the NSX Manager must be installed and connected to the vCenter Server.</p>
</li>
<li>
<p>The NSX Manager must be registered during the HCX Manager install with the admin user.</p>
</li>
<li>
<p>If the NSX Manager IP or FQDN uses self-signed certificates, it might be necessary to trust the NSX system manually using the Import Cert by URL interface in the HCX Appliance Management interface.</p>
</li>
<li>
<p>HCX requires an NSX configured with an Overlay Transport Zone.</p>
</li>
<li>
<p>When NSX-T is registered, both Overlay and VLAN segments can be used during the Network Profile creation.</p>
</li>
<li>
<p>In NSX-T deployments, HCX supports integration only with networking objects created with the NSX Simplified UI/API.</p>
</li>
</ul>
<h3 id="hcx-connector-and-hcx-cloud-installations">HCX Connector and HCX Cloud Installations</h3>
<p>In HCX, there is a notion of an HCX source and an HCX destination environment. Depending on the environment, HCX provides a separate installer.</p>
<ul>
<li>
<p><strong>HCX Connector:</strong>
Use the HCX Connector with the vCenter Server containing the virtual machines that will be migrated.
The HCX Connector is always an HCX source that connects to an HCX Cloud.

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Note</b>
        </div>
        <div class="admonition-content">If the environment is also used as a destination for site pairing and Network Extension, use the HCX Cloud instead.
Installations using OS Assisted Migration require HCX Connector at the source.</div>
    </aside></p>
</li>
<li>
<p><strong>HCX Cloud:</strong>
Use the HCX Cloud installer with the vCenter Server that is the target of site pairing requests, network extensions, and virtual machine migrations. <br>
The HCX Cloud can also serve as the source of a site pair in HCX cloud-to-cloud installations.

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Note</b>
        </div>
        <div class="admonition-content">OS Assisted Migration does not support cloud-to-cloud installations. For OS Assisted Migration, you must have HCX Connector installed as the source.</div>
    </aside></p>
</li>
</ul>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">If the HCX Cloud Manager is used at the source, NSX is required at the source.
The HCX Cloud Manager installation carries higher requirements, but it can be both the source and the target for Site Pairing, HCX Network Extension operations and Service Mesh deployments.</div>
    </aside>
<h3 id="compute-ressources">Compute Resources</h3>
<table>
  <thead>
      <tr>
          <th>Appliance</th>
          <th>vCPU</th>
          <th>Memory</th>
          <th>Disk Space</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>HCX Manager</td>
          <td>4</td>
          <td>12 GB</td>
          <td>60 GB</td>
      </tr>
      <tr>
          <td>HCX Interconnect (HCX-IX)</td>
          <td>8</td>
          <td>6 GB</td>
          <td>2 GB</td>
      </tr>
      <tr>
          <td>HCX Network Extension (HCX-NE)</td>
          <td>8</td>
          <td>3 GB</td>
          <td>2 GB</td>
      </tr>
  </tbody>
</table>
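<p>The table can be turned into a quick sizing estimate per site. The following Python sketch simply sums the values from the table; the dictionary layout, the short keys <code>HCX-IX</code> and <code>HCX-NE</code>, and the helper <code>site_total</code> are my own illustration. How many IX and NE appliances you actually need depends on your service mesh design, so the counts are assumptions.</p>

```python
# Per-appliance requirements taken from the table above.
APPLIANCES = {
    "HCX Manager": {"vcpu": 4, "memory_gb": 12, "disk_gb": 60},
    "HCX-IX": {"vcpu": 8, "memory_gb": 6, "disk_gb": 2},
    "HCX-NE": {"vcpu": 8, "memory_gb": 3, "disk_gb": 2},
}


def site_total(counts):
    """Sum vCPU, memory and disk for the given appliance counts."""
    total = {"vcpu": 0, "memory_gb": 0, "disk_gb": 0}
    for name, count in counts.items():
        for key in total:
            total[key] += APPLIANCES[name][key] * count
    return total


# Assumed minimal site: one manager, one interconnect, one network extension.
print(site_total({"HCX Manager": 1, "HCX-IX": 1, "HCX-NE": 1}))
```

<p>For this assumed minimal site, the sum comes to 20 vCPUs, 21 GB of memory and 64 GB of disk space.</p>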
<h3 id="network-ressources">Network Resources</h3>
<p>I need one HCX appliance per workload domain. It needs access to the respective vCenter and the associated NSX Manager. In addition, the HCX appliance must be able to reach the vMotion network of the respective workload domain, either via routing or by giving the appliance an IP address directly in the vMotion network. I also need at least one uplink VLAN over which the HCX appliances communicate with each other. This could run over a WAN, for example. For my customer, it is a VLAN in the underlay network.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Of course, I can&rsquo;t cover all the migration scenarios that are possible with HCX in this article, which is why the requirements also vary. For my scenario at my customer&rsquo;s site, we decided on vMotion migration between the two VCF stacks. In addition, the Network Extension is to be used to minimize downtime.</div>
    </aside>
<h2 id="deploying-and-configuring-hcx-appliance">Deploying and configuring HCX appliance</h2>
<p>After deploying the HCX appliance, it must be initially configured. To do this, log in to the management interface at <strong>https://hcx-ip-or-fqdn:9443</strong> using the configured admin credentials.
During the initial configuration, you connect the vCenter and the NSX Manager (both must be reachable from the HCX appliance) and set the public access URL (in my case only internally resolvable) and the SSO domain (vCenter). After successful configuration, you can log in to the HCX appliance with the vCenter credentials. The local admin account only works on the HCX management interface.</p>
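<p>Before opening the browser, it can be useful to check whether the management port is reachable at all. This is a minimal Python sketch using only the standard library. The helper names <code>management_url</code> and <code>port_open</code> and the host name <code>hcx.lab.example</code> are placeholders I made up; only port 9443 comes from the URL given earlier. The check only tests TCP reachability and says nothing about certificates or the state of the HCX services.</p>

```python
import socket


def management_url(host, port=9443):
    """Build the URL of the HCX appliance management interface."""
    return f"https://{host}:{port}"


def port_open(host, port=9443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


# Placeholder host name; call port_open("hcx.lab.example") in your own lab.
print(management_url("hcx.lab.example"))
```

<p>If <code>port_open</code> returns <code>False</code>, routing or firewall rules on the management network are a good place to start looking.</p>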
<figure><a href="hcx01.png"><picture><source srcset="/hcx/hcx01_hu_5247a7038c5a31f8.png" type="image/png">
          <img
            src="/hcx/hcx01_hu_5247a7038c5a31f8.png"alt="HCX settings"width="1720"
            height="804"/>
        </picture></a><figcaption><p>HCX settings (click to enlarge)</p></figcaption></figure>
<h2 id="get-started-with-hcx">Get started with HCX</h2>
<p>HCX requires the following: at least one network profile, one compute profile, one or more service meshes and one or more site pairs.</p>
<h3 id="site-pair">Site Pair</h3>
<p>A Site Pair establishes the connection needed for management, authentication, and orchestration of HCX services across a source and destination environment. To create a new site pair, we need the remote HCX URL, which was configured when deploying the HCX appliance. This must be resolvable via the management network. You also need a vCenter user in the target environment who has sufficient permissions. In my lab I use the vsphere.local administrator. This is of course not best practice and should be changed in a production environment.</p>
<figure><a href="hcx04.png"><picture><source srcset="/hcx/hcx04_hu_95c2ca414e38174e.png" type="image/png">
          <img
            src="/hcx/hcx04_hu_95c2ca414e38174e.png"alt="HCX Site Pair"width="1496"
            height="641"/>
        </picture></a><figcaption><p>HCX Site Pair (click to enlarge)</p></figcaption></figure>
<h3 id="network-profiles">Network profiles</h3>
<p>After the Site Pair, I set up the network profiles for the interconnect. These profiles define the networks and the IP addresses that the HCX appliances use. Depending on the setup and service, you need different networks here. In my customer scenario, I can manage with two networks. One profile is for management traffic, which I use to access the vCenter and the NSX Manager from environment A. The second network is for the actual uplink connection. In my lab I have two uplink networks for testing, so the screenshots may differ slightly. I select the appropriate distributed port group and assign a free IP range. You can define the HCX traffic type, but this is only used as a suggestion in the compute profile. Just because I mark a network profile as an HCX uplink doesn&rsquo;t mean it has to be used as an uplink. It only serves to mark the networks for easier configuration in the compute profile. The vMotion network is routed in my lab. If the vMotion network cannot be routed, you can instead give the HCX appliance direct access to it via the network profiles.</p>
<figure><a href="hcx02.png"><picture><source srcset="/hcx/hcx02_hu_7ee0d1421716fe66.png" type="image/png">
          <img
            src="/hcx/hcx02_hu_7ee0d1421716fe66.png"alt="HCX network profiles settings"width="1504"
            height="821"/>
        </picture></a><figcaption><p>HCX network profiles settings (click to enlarge)</p></figcaption></figure>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Free IP Addresses</b>
        </div>
        <div class="admonition-content">Note: I need one free IP in the MGMT network for each of the following: Network Extension appliance, HCX appliance and service mesh (3 IPs). In the vMotion network, 1 IP is required per service mesh (if the vMotion network is not routed) and at least 1 uplink IP is required per service mesh. The HCX interface offers a calculator for the IP planning.</div>
    </aside>
<h3 id="compute-profiles">Compute Profiles</h3>
<p>The compute profiles in HCX specify which HCX services are enabled and which compute resources (clusters or resource pools) and networks they use. You can also set CPU and memory reservations for the interconnect appliance. Depending on the HCX service, you have to assign several networks, e.g. management network, uplink networks, vMotion network and the network container. In my case, the network container is the overlay transport zone from my source NSX, as I want to migrate machines from one NSX environment to another NSX environment.</p>
<figure><a href="hcx03.png"><picture><source srcset="/hcx/hcx03_hu_64dfef72248e595.png" type="image/png">
          <img
            src="/hcx/hcx03_hu_64dfef72248e595.png"alt="HCX Compute profiles settings"width="1355"
            height="937"/>
        </picture></a><figcaption><p>HCX Compute profiles settings (click to enlarge)</p></figcaption></figure>
<p>The screenshot shows very clearly how the Interconnect appliance is later connected. The uplinks can be VLAN connections within a data center, public WAN connections, MPLS or VPN. There are very few restrictions here. The uplink networks only need to reach each other between the locations. It does not matter whether it is via Layer2 or Layer3.</p>
<h3 id="service-mesh">Service Mesh</h3>
<p>In the service mesh, compute profiles, HCX services and advanced configurations such as MTU for uplink links or the number of appliances for the network extension are defined and assigned. After I have configured the service mesh, the interconnect VMs are automatically deployed at both HCX locations.</p>
<figure><a href="hcx05.png"><picture><source srcset="/hcx/hcx05_hu_d153b505b4e37d7d.png" type="image/png">
          <img
            src="/hcx/hcx05_hu_d153b505b4e37d7d.png"alt="HCX Service Mesh"width="2019"
            height="611"/>
        </picture></a><figcaption><p>HCX Service Mesh (click to enlarge)</p></figcaption></figure>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Service Mesh</b>
        </div>
        <div class="admonition-content">The service mesh cannot be created until both sides of the site pair are configured and have compute and network profiles.</div>
    </aside>
<h2 id="firewall-settings-for-hcx">Firewall Settings for HCX</h2>
<p>Once the compute profile for the interconnect has been set up, HCX generates a firewall port matrix that you can easily copy or export from the HCX GUI. Here is an example from my lab.</p>
<table>
  <thead>
      <tr>
          <th>Source</th>
          <th>Destination</th>
          <th>Services</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>ANY</td>
          <td>ANY</td>
          <td>UDP(3784)</td>
      </tr>
      <tr>
          <td>192.168.70.30-192.168.70.40</td>
          <td>192.168.70.30-192.168.70.40</td>
          <td>UDP(3784), TCP(8182)</td>
      </tr>
      <tr>
          <td>192.168.70.30-192.168.70.40</td>
          <td>192.168.70.10</td>
          <td>TCP(9555)</td>
      </tr>
      <tr>
          <td>192.168.70.30-192.168.70.40</td>
          <td>192.168.12.100</td>
          <td>TCP(443)</td>
      </tr>
      <tr>
          <td>192.168.70.10</td>
          <td>192.168.70.30-192.168.70.40</td>
          <td>TCP(9443)</td>
      </tr>
      <tr>
          <td>192.168.70.10</td>
          <td>192.168.12.100</td>
          <td>TCP(443)</td>
      </tr>
      <tr>
          <td>192.168.70.10</td>
          <td>192.168.12.203, 192.168.12.204</td>
          <td>TCP(902), TCP(80), TCP(443)</td>
      </tr>
      <tr>
          <td>192.168.12.203, 192.168.12.204</td>
          <td>192.168.70.30-192.168.70.40</td>
          <td>TCP(31031), TCP(32032), TCP(44046)</td>
      </tr>
      <tr>
          <td>192.168.70.30-192.168.70.40</td>
          <td>192.168.12.203, 192.168.12.204</td>
          <td>TCP(902), TCP(80)</td>
      </tr>
      <tr>
          <td>192.168.12.203, 192.168.12.204</td>
          <td>192.168.70.30-192.168.70.40</td>
          <td>TCP(8000)</td>
      </tr>
      <tr>
          <td>192.168.70.30-192.168.70.40</td>
          <td>192.168.12.203, 192.168.12.204</td>
          <td>TCP(8000)</td>
      </tr>
      <tr>
          <td>192.168.70.10</td>
          <td>192.168.70.30-192.168.70.40</td>
          <td>TCP(443), TCP(8123), TCP(9443)</td>
      </tr>
      <tr>
          <td>ANY</td>
          <td>172.21.22.10-172.21.22.100, 172.21.21.10-172.21.21.100</td>
          <td>TCP(5201), UDP(4500), UDP(5201)</td>
      </tr>
  </tbody>
</table>
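<p>If you want to hand the matrix over to a firewall team in a structured form, the service column can be split into individual protocol/port pairs. The following is only a minimal Python sketch: the helper name <code>parse_services</code> is my own invention, and it assumes the cell format shown in the table above (for example <code>TCP(443), TCP(8123)</code>). It is not part of HCX.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import re

# Hypothetical helper (not an HCX tool): turns one "Services" cell of the
# exported port matrix into (protocol, port) tuples.
SERVICE_PATTERN = re.compile(r"(TCP|UDP)\((\d+)\)")


def parse_services(cell):
    """Return a list of (protocol, port) tuples found in one table cell."""
    return [(proto, int(port)) for proto, port in SERVICE_PATTERN.findall(cell)]


# Three rows copied from the lab matrix: (source, destination, services).
rows = [
    ("192.168.70.10", "192.168.12.100", "TCP(443)"),
    ("192.168.70.30-192.168.70.40", "192.168.70.10", "TCP(9555)"),
    ("192.168.70.10", "192.168.70.30-192.168.70.40", "TCP(443), TCP(8123), TCP(9443)"),
]

# Print one line per protocol/port pair.
for source, destination, services in rows:
    for proto, port in parse_services(services):
        print(f"{source} to {destination}: {proto}/{port}")
</code></pre></div>
<p>Each printed line is one flow that has to be allowed, which makes it easier to compare the matrix against an existing ruleset.</p>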
<h2 id="network-extension">Network Extension</h2>
<p>With the network extension in HCX, layer-2 networks can be extended between the source and target environments without having to change the IP addressing of the workloads. This allows virtual machines to be seamlessly migrated while maintaining their existing network connection and connectivity, and is a prerequisite for a “zero” downtime migration. To create a layer 2 stretch via the network extension, I just have to click on <strong>Network Extension</strong> &raquo; <strong>Create a Network Extension</strong> and select my network that I want to stretch to the other location via HCX. Here I can only choose from networks in my previously configured network container. In my case, these are the networks from my NSX.</p>
<figure><a href="hcx06.png"><picture><source srcset="/hcx/hcx06_hu_c2b473f8297295d1.png" type="image/png">
          <img
            src="/hcx/hcx06_hu_c2b473f8297295d1.png"alt="HCX Network Extension"width="1651"
            height="1043"/>
        </picture></a><figcaption><p>HCX Network Extension (click to enlarge)</p></figcaption></figure>
<p>The stretched networks are created at the selected T1 router in the target environment and given a prefix. However, the segments are not connected to the T1. The north/south connection is via the T1 and T0 in the source environment. This ensures that no traffic for the VMs is routed into a black hole.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">For each network that is stretched across the network extension, a network adapter is required on the appliance. Since the appliance is subject to the same limits as all other VMs, a maximum of 10 network adapters can be used. In my case, I lose 1 adapter for management and two adapters for my uplinks. So I can stretch a maximum of 7 NSX networks over this appliance. If I want to stretch more networks, I need another network extension appliance.</div>
    </aside>
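<p>The adapter math from the note can be turned into a quick planning aid. This is a rough Python sketch under the assumptions stated in the note (10 adapters per VM, 1 for management, 2 for uplinks in my lab); the function name <code>appliances_needed</code> is purely illustrative, and your adapter usage may differ.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import math

# Assumption taken from the note: a VM can have at most 10 network adapters.
MAX_VM_ADAPTERS = 10


def appliances_needed(stretched_networks, mgmt_adapters=1, uplink_adapters=2):
    """Estimate how many Network Extension appliances are required.

    The defaults (1 management adapter, 2 uplink adapters) mirror my lab.
    """
    per_appliance = MAX_VM_ADAPTERS - mgmt_adapters - uplink_adapters
    return math.ceil(stretched_networks / per_appliance)


for count in (7, 8, 20):
    print(count, "networks need", appliances_needed(count), "appliance(s)")
</code></pre></div>
<p>With the lab defaults, 7 networks fit on one appliance, while the eighth network already requires a second one.</p>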
<h3 id="how-does-network-extension-work">How does Network Extension work?</h3>
<p>The sink port on the HCX Network Extension serves as the endpoint for the extended Layer 2 traffic between the source and target environments. It is configured on the source and target sides to receive and send the incoming traffic from the extended network. This can be easily seen on the segment ports in the NSX of the HCX appliance. This is where MAC/IP bindings are exchanged between VMs. If you look at the destination side, you will see (provided that the source VMs are generating traffic) the IP/MAC bindings of the VMs on the source segment on the sink port and vice versa for VMs that have already been migrated to the destination side.</p>
<figure><a href="hcx07.png"><picture><source srcset="/hcx/hcx07_hu_e4ffc5a85da85005.png" type="image/png">
          <img
            src="/hcx/hcx07_hu_e4ffc5a85da85005.png"alt="HCX Network Extension Bindings"width="1024"
            height="271"/>
        </picture></a><figcaption><p>HCX Network Extension Bindings (click to enlarge)</p></figcaption></figure>
<p>For this to work, special port profiles are created for the segment ports of the HCX Network Extension appliance that allow MAC learning, MAC changing and unknown unicast flooding. By default, the NSX segment profile would prevent this and block it.</p>
<h2 id="migrating-vms">Migrating VMs</h2>
<p>HCX supports a wide range of migration options. Since I aim for zero downtime, I use vMotion as my migration method. For less critical VMs with a maintenance window, a bulk migration with short downtime is also an option. Here you can plan a mass migration and make a controlled switchover at a certain point in time. To do this, a replica of each VM is created and incrementally synchronized. During the cutover, the final delta is synchronized and the VM is powered on again at the destination. Of course, there are several variants, and not every variant is suitable for every scenario.</p>
<p>vMotion migration in VMware HCX allows me to migrate virtual machines live and without downtime from a source environment to a target environment. HCX uses the vMotion technology but extends it to include the option of migrating workloads in a scheduled manner and across geographically separate data centers.</p>
<p>To plan the migration, I go to <strong>Services</strong> &raquo; <strong>Migration</strong> and create a <strong>NEW Mobility Group</strong>. Here I can set all the relevant migration settings.</p>
<figure><a href="hcx08.png"><picture><source srcset="/hcx/hcx08_hu_2bcdabbccd023ad1.png" type="image/png">
          <img
            src="/hcx/hcx08_hu_2bcdabbccd023ad1.png"alt="HCX New Mobility Group"width="1627"
            height="1157"/>
        </picture></a><figcaption><p>HCX New Mobility Group (click to enlarge)</p></figcaption></figure>
<p>The settings are relatively self-explanatory. I select my VMs, choose the target cluster, storage and switchover time, and select my migration method. If the VM comes from a network stretched with the Network Extension, the target network is already preselected. I just have to perform the validation and the migration can start. It is also possible to perform a reverse migration.</p>
<figure><a href="hcx09.png"><picture><source srcset="/hcx/hcx09_hu_410dbf9e4a12de66.png" type="image/png">
          <img
            src="/hcx/hcx09_hu_410dbf9e4a12de66.png"alt="HCX Migration"width="1445"
            height="604"/>
        </picture></a><figcaption><p>HCX Migration (click to enlarge)</p></figcaption></figure>
<p>The migration was successful and there was only one ping interruption during the entire migration. The VM was functional and accessible the whole time. Thanks to the network extension, both east/west and north/south traffic works.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">64 bytes from 10.10.20.20: icmp_seq=34 ttl=60 time=6.854 ms
</span></span><span class="line"><span class="cl">64 bytes from 10.10.20.20: icmp_seq=35 ttl=60 time=2.144 ms
</span></span><span class="line"><span class="cl">Request timeout for icmp_seq 36
</span></span><span class="line"><span class="cl">64 bytes from 10.10.20.20: icmp_seq=37 ttl=60 time=11.416 ms
</span></span><span class="line"><span class="cl">64 bytes from 10.10.20.20: icmp_seq=38 ttl=60 time=4.916 ms
</span></span><span class="line"><span class="cl">64 bytes from 10.10.20.20: icmp_seq=39 ttl=60 time=3.661 ms
</span></span><span class="line"><span class="cl">64 bytes from 10.10.20.20: icmp_seq=40 ttl=60 time=3.842 ms
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">--- 10.10.20.20 ping statistics ---
</span></span><span class="line"><span class="cl">57 packets transmitted, 56 packets received, 1.8% packet loss
</span></span></code></pre></div><p>After all VMs have been migrated from the source to the target, the network extension for the network via HCX can be removed. If desired, HCX can automatically connect the segments at the target to the T1 router. The segments at the source must then be manually disconnected from the source T1. As long as the same segment subnet is connected to the source and target at the local T1, routing problems may occur. This can be successfully prevented with local preference and AS-Path prepend, by making the segments on the source side less favorable if BGP is used. This only affects north/south connectivity. East/west connectivity is not affected.</p>
<h2 id="conclusion">Conclusion</h2>
<p>I am aware that I have only scratched the surface of what HCX can do. The tool is extremely powerful, and you always have to check which HCX use case fits your scenario. I have not addressed WAN optimization or the MON feature, nor have I gone into detail about the other migration options. That would go far beyond the scope of this post, and I wanted to write about a scenario that I have already implemented for a customer. I can advise everyone to take a closer look at HCX. It is a powerful migration tool that can help you out of unpleasant situations or simply simplify mass migrations from A to B. Since HCX is now part of the VCF product portfolio, there is no reason not to use the tool.</p>
]]></content>
		</item>
		
		<item>
			<title>How Apply To works in NSX DFW</title>
			<link>https://sdn-warrior.org/posts/nsx-apply-to/</link>
			<pubDate>Sat, 11 Jan 2025 02:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/nsx-apply-to/</guid>
			<description><![CDATA[How Apply To works in NSX DFW and how you can use it.]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>When working with the <strong>NSX Distributed Firewall (DFW)</strong>, one feature that often goes unnoticed or misunderstood is <strong>&lsquo;Apply To&rsquo;</strong>. Despite its importance, it is frequently underestimated or even ignored. This is unfortunate, as <strong>&lsquo;Apply To&rsquo;</strong> is a powerful feature that can significantly influence how firewall rules are applied within an NSX environment.</p>
<p>In many VMware training courses, <strong>&lsquo;Apply To&rsquo;</strong> is either poorly explained or not mentioned at all. As a result, administrators and engineers might miss out on opportunities to optimize their firewall rule configurations. Misunderstanding or neglecting this feature can lead to overly complex rulesets or unexpected behavior in distributed environments.</p>
<p>In this post, I aim to demystify the <strong>&lsquo;Apply To&rsquo;</strong> feature. I will explore its functionality, demonstrate its practical use cases, and analyze how it impacts the behavior of firewall rules. By the end, you should have a solid understanding of how and when to use <strong>&lsquo;Apply To&rsquo;</strong> effectively in your NSX environment.</p>
<h2 id="questions-to-address">Questions to Address</h2>
<p>To fully understand the <strong>&lsquo;Apply To&rsquo;</strong> feature in the NSX Distributed Firewall, it is essential to examine its behavior under various scenarios. In this post, I will address the following questions:</p>
<ol>
<li>
<p><strong>What happens if we use the <code>&lt;Applied To&gt;</code> field on the policy and set the <code>&lt;Applied To&gt;</code> rule as DFW?</strong><br>
How does this configuration affect the scope and enforcement of firewall rules?</p>
</li>
<li>
<p><strong>What will happen if we use a Security Group in the <code>&lt;Applied To&gt;</code> field in the rule, while the <code>&lt;Applied To&gt;</code> field in the policy is set as DFW?</strong><br>
What interactions or overlaps should be expected between these configurations?</p>
</li>
<li>
<p><strong>What happens if we use <code>security group1</code> in the policy <code>&lt;Applied To&gt;</code> field and <code>security group2</code> in the rule <code>&lt;Applied To&gt;</code> field?</strong><br>
How do these overlapping or conflicting settings impact the rule application?</p>
</li>
<li>
<p><strong>Does the <code>&lt;Applied To&gt;</code> field in the rule or the <code>&lt;Applied To&gt;</code> field in the policy take precedence?</strong><br>
When both are defined, which one ultimately dictates the scope of rule enforcement?</p>
</li>
</ol>
<p>By exploring these scenarios, I aim to clarify the nuanced behavior of the <strong>&lsquo;Apply To&rsquo;</strong> feature and provide actionable insights for optimizing your NSX DFW configurations.</p>
<h2 id="practical-demonstrations-with-cli">Practical Demonstrations with CLI</h2>
<p>In addition to exploring the theoretical aspects of the <strong>&lsquo;Apply To&rsquo;</strong> feature, I will use practical CLI commands to demonstrate where the rules are actually enforced within the NSX infrastructure. This hands-on approach will help you visualize and verify how <strong>&lsquo;Apply To&rsquo;</strong> configurations are realized in practice.</p>
<p>By combining both conceptual explanations and practical examples, you will gain a deeper understanding of how to effectively use the <strong>&lsquo;Apply To&rsquo;</strong> feature in real-world scenarios.</p>
<h2 id="test-setup">Test Setup</h2>
<p>For this blog, I am using <strong>NSX version 4.2.2.1</strong> to demonstrate the behavior of the <strong>&lsquo;Apply To&rsquo;</strong> feature in the Distributed Firewall. The goal is to keep the setup simple yet effective for understanding how different configurations influence the enforcement of firewall rules.</p>
<h3 id="environment-overview">Environment Overview</h3>
<p>The test environment consists of three lightweight Linux virtual machines running Alpine:</p>
<ul>
<li><strong>Alpine01</strong>: <code>10.10.20.10</code></li>
<li><strong>Alpine02</strong>: <code>10.10.20.20</code></li>
<li><strong>Alpine03</strong>: <code>10.10.20.30</code> (Control VM)</li>
</ul>
<p>All VMs reside on the <strong>same NSX segment</strong>, simplifying the network to focus on rule behavior and ensuring consistent connectivity between VMs. Notably, <strong>Alpine03</strong> serves as a control VM and is not part of any NSX Security Group. This ensures it remains unaffected by specific <strong>&lsquo;Apply To&rsquo;</strong> configurations, providing a baseline for comparison.</p>
<h3 id="security-groups">Security Groups</h3>
<p>To organize and manage the test VMs, I have created the following NSX Security Groups:</p>
<ol>
<li>
<p><strong>dFG_all_Alpine</strong><br>
This group includes <strong>Alpine01</strong> and <strong>Alpine02</strong>. It represents all Alpine VMs involved in the main testing scenarios.</p>
</li>
<li>
<p><strong>dFG_Alpine01</strong><br>
A dedicated Security Group for <strong>Alpine01</strong>, enabling granular control over rules specific to this VM.</p>
</li>
<li>
<p><strong>dFG_Alpine02</strong><br>
A separate Security Group for <strong>Alpine02</strong>, allowing isolated configurations tailored to this VM.</p>
</li>
</ol>
<p><strong>Alpine03</strong> does not belong to any Security Group, making it a neutral VM for verifying how rules or configurations behave when no specific policies are applied. This provides a clear reference point to validate the impact of <strong>&lsquo;Apply To&rsquo;</strong> settings.</p>
<h3 id="why-alpine03-as-a-control-vm">Why Alpine03 as a Control VM?</h3>
<p>Having a control VM like <strong>Alpine03</strong> ensures we have a clean baseline to compare against during the tests. By keeping it outside of any Security Group, we can confirm that any observed effects are solely due to the configurations applied to <strong>Alpine01</strong> and <strong>Alpine02</strong>. This approach eliminates ambiguity and helps highlight the true behavior of the <strong>&lsquo;Apply To&rsquo;</strong> feature.</p>
<h3 id="additional-tools-for-validation">Additional Tools for Validation</h3>
<p>To analyze the behavior further, I will use CLI commands such as <strong><code>summarize-dvfilter</code></strong> and <strong><code>vsipioctl getrules</code></strong>, as well as the NSX <strong>Traceflow</strong> tool. This combination ensures an in-depth understanding of where and how rules are enforced, both at the hypervisor and the network level.</p>
<h3 id="naming-convention-a-recommended-best-practice">Naming Convention: A Recommended Best Practice</h3>
<p>As a personal convention, I prefix all custom Security Groups with <strong>dFG</strong>, which stands for Distributed Firewall Group. This prefix helps me quickly identify and differentiate groups used for NSX Distributed Firewall purposes from other objects in the environment.</p>
<p>I strongly recommend adopting a consistent and meaningful naming scheme for your NSX environment. A clear structure not only improves day-to-day management but also prevents confusion in larger setups with potentially hundreds of objects. Whether you use prefixes like <code>dFG</code> or other naming conventions, the key is consistency.</p>
<h3 id="why-use-a-single-esxi-host">Why Use a Single ESXi Host?</h3>
<p>Running all test VMs on a single ESXi host is intentional. It allows me to leverage the <strong>ESXi CLI</strong> to show where and how the Distributed Firewall rules are realized during the tests. This provides deeper insights into the inner workings of the NSX DFW, bridging the gap between configuration in the NSX Manager and actual enforcement at the hypervisor level.</p>
<p>This setup offers a practical foundation to explore and analyze the <strong>&lsquo;Apply To&rsquo;</strong> feature in detail, combining theoretical explanations with real-world CLI examples.</p>
<h3 id="verification-with-cli-tools">Verification with CLI Tools</h3>
<p>To complement the theoretical understanding of the <strong>&lsquo;Apply To&rsquo;</strong> feature, I will use practical CLI tools during the tests to demonstrate where and how firewall rules are enforced. This includes inspecting the ESXi host to see how the Distributed Firewall implements the configurations. Below are the CLI commands and tools I will use:</p>
<ol>
<li>
<p><strong><code>summarize-dvfilter | grep -A16 &lt;VMName&gt;</code></strong><br>
This command retrieves detailed information about the virtual NIC (vNIC) associated with a specific VM. By identifying the correct vNIC name (e.g., <code>&lt;nic-XXXXXXX-eth0-vmware-sfw.2&gt;</code>), we can pinpoint the interface where the firewall rules are applied.</p>
</li>
<li>
<p><strong><code>vsipioctl getrules -f &lt;name from the vNIC&gt;</code></strong><br>
Once the vNIC name is identified, this command provides the complete list of firewall rules applied to that specific VM. This is a powerful way to verify the exact rules enforced at the hypervisor level.</p>
</li>
<li>
<p><strong>NSX Traceflow Tool</strong><br>
In addition to ESXi CLI commands, the NSX <strong>Traceflow</strong> tool can be used to simulate and trace packet flow through the network. This tool helps visualize the path packets take and how firewall rules affect traffic between VMs.</p>
</li>
</ol>
<h3 id="practical-workflow">Practical Workflow</h3>
<p>During the tests, the workflow will involve the following steps:</p>
<ol>
<li>Use the <strong><code>summarize-dvfilter</code></strong> command to locate the vNIC name for the VM being tested (e.g., Alpine01 or Alpine02).</li>
<li>Retrieve the enforced firewall rules using <strong><code>vsipioctl getrules</code></strong> with the vNIC name as input.</li>
<li>Validate the observed behavior using the NSX Traceflow tool to ensure that the rules are applied as expected and to simulate specific traffic flows for further analysis.</li>
</ol>
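<p>The first two steps of this workflow can be chained with a small script. The following Python sketch only shows the idea: the sample text is an illustrative excerpt and not real output from my host, its layout is an assumption, and the helper name <code>find_vnic_name</code> is made up for this example. The only thing it relies on is the vNIC naming pattern <code>nic-XXXXXXX-eth0-vmware-sfw.2</code> mentioned above.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import re

# Illustrative excerpt only: the layout of the summarize-dvfilter output and
# all numbers in it are assumptions, not captured from a real host.
SAMPLE_OUTPUT = """\
 port 12345678 Alpine01.eth0
  vNic slot 2
   name: nic-1234567-eth0-vmware-sfw.2
   agentName: vmware-sfw
"""

# Matches DFW filter names such as nic-1234567-eth0-vmware-sfw.2
VNIC_PATTERN = re.compile(r"nic-\d+-eth\d+-vmware-sfw\.\d+")


def find_vnic_name(dvfilter_output):
    """Return the first DFW vNIC filter name in the text, or None."""
    match = VNIC_PATTERN.search(dvfilter_output)
    return match.group(0) if match else None


vnic = find_vnic_name(SAMPLE_OUTPUT)
# Build the follow-up command for step 2 of the workflow.
print(f"vsipioctl getrules -f {vnic}")
</code></pre></div>
<p>On a real host, you would feed the function the text returned by <code>summarize-dvfilter</code> instead of the sample string.</p>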
<p>These tools provide a hands-on approach to understanding how the <strong>&lsquo;Apply To&rsquo;</strong> feature works, bridging the gap between configuration and real-world enforcement.</p>
<h2 id="test-1-what-happens-if-we-use-the-applied-to-field-on-the-policy-and-set-the-applied-to-rule-as-dfw">Test 1: What Happens If We Use the <code>&lt;Applied To&gt;</code> Field on the Policy and Set the <code>&lt;Applied To&gt;</code> Rule as DFW?</h2>
<p>To begin our exploration of the <strong>&lsquo;Apply To&rsquo;</strong> feature, we start with a simple scenario. In this test, I create a policy that allows <strong>ICMP traffic</strong> from <strong>any</strong> source to the Security Group <strong>dFG_all_Alpine</strong>. The <strong>&lsquo;Apply To&rsquo;</strong> configuration is as follows:</p>
<ul>
<li>The <strong>rule&rsquo;s <code>&lt;Applied To&gt;</code> field</strong> is set to <strong>DFW (Distributed Firewall)</strong>.</li>
<li>The <strong>policy&rsquo;s <code>&lt;Applied To&gt;</code> field</strong> is set to the Security Group <strong>dFG_all_Alpine</strong>.</li>
</ul>
<h3 id="configuration-details">Configuration Details</h3>
<ol>
<li>
<p><strong>Policy</strong>:</p>
<ul>
<li>Name: <strong>ICMP Allow Test</strong></li>
<li>Source: <strong>Any</strong></li>
<li>Destination: <strong>dFG_all_Alpine</strong></li>
<li>Service: <strong>ICMP</strong></li>
<li>Action: <strong>Allow</strong></li>
<li><code>&lt;Applied To&gt;</code>: <strong>dFG_all_Alpine</strong></li>
</ul>
</li>
<li>
<p><strong>Rule</strong>:</p>
<ul>
<li><code>&lt;Applied To&gt;</code>: <strong>DFW</strong></li>
</ul>
</li>
</ol>
<h3 id="expected-behavior">Expected Behavior</h3>
<p>In this configuration, the <strong>policy&rsquo;s <code>&lt;Applied To&gt;</code> field</strong> limits the scope of the policy to the members of the <strong>dFG_all_Alpine</strong> group.</p>
<p>The expected result is that the rule is <strong>only enforced</strong> on the members of the <strong>dFG_all_Alpine</strong> group (as defined in the policy&rsquo;s <code>&lt;Applied To&gt;</code>), even if the rule is set to <strong>DFW</strong>.</p>
<h3 id="test-results">Test Results</h3>
<p>Using the ESXi CLI, I collected the following details for the vNICs associated with each VM:</p>
<ul>
<li>
<p><strong>Alpine01</strong>:</p>
<ul>
<li>Port: <code>67108887</code></li>
<li>vNIC name: <code>nic-533240-eth0-vmware-sfw.2</code></li>
</ul>
</li>
<li>
<p><strong>Alpine02</strong>:</p>
<ul>
<li>Port: <code>67108888</code></li>
<li>vNIC name: <code>nic-533279-eth0-vmware-sfw.2</code></li>
</ul>
</li>
<li>
<p><strong>Alpine03 (Control VM)</strong>:</p>
<ul>
<li>Port: <code>67108889</code></li>
<li>vNIC name: <code>nic-544799-eth0-vmware-sfw.2</code></li>
</ul>
</li>
</ul>
<h4 id="firewall-rules-observed">Firewall Rules Observed</h4>
<ol>
<li><strong>Alpine01 (nic-533240-eth0-vmware-sfw.2)</strong> and <strong>Alpine02 (nic-533279-eth0-vmware-sfw.2)</strong>:<br>
Both VMs have the following rules applied:</li>
</ol>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ruleset mainrs {
</span></span><span class="line"><span class="cl"># PRE_FILTER rules
</span></span><span class="line"><span class="cl">rule 10216 at 1 inout protocol tcp strict from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 port 22 accept;
</span></span><span class="line"><span class="cl"># FILTER (APP Category) rules
</span></span><span class="line"><span class="cl">rule 10217 at 1 inout protocol icmp from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 accept;
</span></span><span class="line"><span class="cl">rule 10217 at 2 inout protocol ipv6-icmp from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 accept;
</span></span><span class="line"><span class="cl">rule 2 at 3 inout protocol any from any to any reject with log tag &#39;debug&#39;;
</span></span><span class="line"><span class="cl">}
</span></span></code></pre></div><p>Key Observations:</p>
<ul>
<li>ICMP traffic is explicitly allowed (rule 10217) for both IPv4 and IPv6 from any source to the address set associated with dFG_all_Alpine.</li>
<li>All other traffic is rejected (rule 2), demonstrating that the rules are scoped correctly to the policy&rsquo;s <code>&lt;Applied To&gt;</code> field.</li>
<li>Alpine02 (nic-533279-eth0-vmware-sfw.2): the rules applied to Alpine02 are identical to those observed for Alpine01.</li>
</ul>
<ol start="2">
<li><strong>Alpine03 (Control VM, nic-544799-eth0-vmware-sfw.2)</strong>:</li>
</ol>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ruleset mainrs {
</span></span><span class="line"><span class="cl"># FILTER (APP Category) rules
</span></span><span class="line"><span class="cl">rule 2 at 1 inout protocol any from any to any reject with log tag &#39;debug&#39;;
</span></span><span class="line"><span class="cl">}
</span></span></code></pre></div><p>Key Observations:</p>
<p>Only the default reject rule is applied. Since Alpine03 is not a member of dFG_all_Alpine, the policy does not apply to this VM, confirming that the <code>&lt;Applied To&gt;</code> field in the policy limits enforcement to the intended scope.</p>
<p>Conclusion for Test 1
The results confirm the expected behavior:</p>
<ul>
<li>The policy&rsquo;s <code>&lt;Applied To&gt;</code> field effectively limits the scope of enforcement to the Security Group dFG_all_Alpine.</li>
<li>Although the rule&rsquo;s <code>&lt;Applied To&gt;</code> field is set to DFW, rules are only applied to the VMs within the policy&rsquo;s scope.</li>
<li>The Control VM (Alpine03), which is outside the Security Group, does not have the ICMP allow rule applied, demonstrating the precision of the <code>&lt;Applied To&gt;</code> field.</li>
</ul>
<h2 id="test-2-what-will-happen-if-we-use-a-security-group-in-the-applied-to-field-in-the-rule-and-the-applied-to-field-in-the-policy-is-set-as-dfw">Test 2: What Will Happen If We Use a Security Group in the <code>&lt;Applied To&gt;</code> Field in the Rule and the <code>&lt;Applied To&gt;</code> Field in the Policy Is Set as DFW?</h2>
<p>In this scenario, I configure a policy to allow <strong>ICMP traffic</strong> from <strong>any</strong> source to the <strong>dFG_all_Alpine</strong> Security Group. However, this time I modify the <strong><code>&lt;Applied To&gt;</code> field</strong> as follows:</p>
<ul>
<li>The <strong>policy’s <code>&lt;Applied To&gt;</code> field</strong> is set to <strong>DFW (Distributed Firewall)</strong>.</li>
<li>The <strong>rule’s <code>&lt;Applied To&gt;</code> field</strong> is set to the Security Group <strong>dFG_Alpine01</strong>.</li>
</ul>
<h3 id="configuration-details-1">Configuration Details</h3>
<ol>
<li>
<p><strong>Policy</strong>:</p>
<ul>
<li>Name: <strong>ICMP Allow Test 2</strong></li>
<li>Source: <strong>Any</strong></li>
<li>Destination: <strong>dFG_all_Alpine</strong></li>
<li>Service: <strong>ICMP</strong></li>
<li>Action: <strong>Allow</strong></li>
<li><code>&lt;Applied To&gt;</code>: <strong>DFW</strong></li>
</ul>
</li>
<li>
<p><strong>Rule</strong>:</p>
<ul>
<li><code>&lt;Applied To&gt;</code>: <strong>dFG_Alpine01</strong></li>
</ul>
</li>
</ol>
<h3 id="expected-behavior-1">Expected Behavior</h3>
<p>This configuration introduces a more restrictive scope at the <strong>rule</strong> level:</p>
<ul>
<li>Since the rule’s <code>&lt;Applied To&gt;</code> field is set to <strong>dFG_Alpine01</strong>, it should only be <strong>enforced on the VMs in this group</strong>.</li>
<li>The policy’s broader <code>&lt;Applied To&gt;</code> field (set to <strong>DFW</strong>) may lead you to assume that the policy and its rules are applied everywhere, but actual enforcement should be limited to <strong>dFG_Alpine01</strong> due to the rule-level restriction.</li>
</ul>
<h3 id="results">Results</h3>
<h4 id="firewall-rules-observed-1">Firewall Rules Observed</h4>
<ol>
<li><strong>Alpine01 (nic-533240-eth0-vmware-sfw.2)</strong>:<br>
The following rules were observed:</li>
</ol>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ruleset mainrs {
</span></span><span class="line"><span class="cl"># PRE_FILTER rules
</span></span><span class="line"><span class="cl">rule 10216 at 1 inout protocol tcp strict from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 port 22 accept;
</span></span><span class="line"><span class="cl"># FILTER (APP Category) rules
</span></span><span class="line"><span class="cl">rule 10217 at 1 inout protocol icmp from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 accept;
</span></span><span class="line"><span class="cl">rule 10217 at 2 inout protocol ipv6-icmp from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 accept;
</span></span><span class="line"><span class="cl">rule 2 at 3 inout protocol any from any to any reject with log tag &#39;debug&#39;;
</span></span><span class="line"><span class="cl">}
</span></span></code></pre></div><p>Key Observations:</p>
<p>The ICMP allow rule is enforced as expected, scoped to Alpine01, which is part of dFG_Alpine01.</p>
<ol start="2">
<li><strong>Alpine02 (nic-533279-eth0-vmware-sfw.2):</strong>
The following rules were observed:</li>
</ol>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ruleset mainrs {
</span></span><span class="line"><span class="cl"># FILTER (APP Category) rules
</span></span><span class="line"><span class="cl"> rule 2 at 1 inout protocol any from any to any reject with log tag &#39;debug&#39;;
</span></span><span class="line"><span class="cl">}
</span></span></code></pre></div><p>Key Observations:</p>
<p>The ICMP allow rule is not applied to Alpine02, as it is not part of dFG_Alpine01.</p>
<ol start="3">
<li><strong>Alpine03 (Control VM, nic-544799-eth0-vmware-sfw.2):</strong>
The following rules were observed:</li>
</ol>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ruleset mainrs {
</span></span><span class="line"><span class="cl"> # FILTER (APP Category) rules
</span></span><span class="line"><span class="cl"> rule 2 at 1 inout protocol any from any to any reject with log tag &#39;debug&#39;;
</span></span><span class="line"><span class="cl">}
</span></span></code></pre></div><p>Key Observations:</p>
<p>As expected, no ICMP allow rule is applied since Alpine03 is not part of any relevant Security Group.</p>
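<p>Comparing rule dumps by eye gets tedious once more than a few VMs are involved. The following is a minimal sketch of how the check could be scripted, assuming the dump text has already been saved per vNIC filter. The function name <code>icmp_allow_rule_ids</code> is hypothetical, and the sample strings are copied from the Alpine01 and Alpine02 outputs above.</p>

```python
# Sample dumps copied from the Test 2 outputs above.
ALPINE01_RULES = """\
ruleset mainrs {
# PRE_FILTER rules
rule 10216 at 1 inout protocol tcp strict from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 port 22 accept;
# FILTER (APP Category) rules
rule 10217 at 1 inout protocol icmp from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 accept;
rule 10217 at 2 inout protocol ipv6-icmp from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 accept;
rule 2 at 3 inout protocol any from any to any reject with log tag 'debug';
}
"""

ALPINE02_RULES = """\
ruleset mainrs {
# FILTER (APP Category) rules
 rule 2 at 1 inout protocol any from any to any reject with log tag 'debug';
}
"""


def icmp_allow_rule_ids(dump: str) -> set[int]:
    """Return the IDs of rules that accept ICMP or ICMPv6 in a ruleset dump."""
    rule_ids = set()
    for line in dump.splitlines():
        tokens = line.strip().rstrip(";").split()
        # Skip comments, braces and anything that is not an L3 rule line.
        if not tokens or tokens[0] != "rule" or "protocol" not in tokens:
            continue
        protocol = tokens[tokens.index("protocol") + 1]
        if protocol in ("icmp", "ipv6-icmp") and "accept" in tokens:
            rule_ids.add(int(tokens[1]))
    return rule_ids


print("Alpine01:", sorted(icmp_allow_rule_ids(ALPINE01_RULES)))
print("Alpine02:", sorted(icmp_allow_rule_ids(ALPINE02_RULES)))
```

<p>An empty result for a VM means the ICMP allow rule was not pushed to that vNIC, which is exactly what separates Alpine01 from Alpine02 and Alpine03 in this test.</p>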
<h2 id="test-3-what-happens-if-we-use-security-group1-in-the-policy-applied-to-field-and-group2-in-the-rule-applied-to-field">Test 3: What Happens If We Use Security <code>&lt;group1&gt;</code> in the Policy <code>&lt;Applied To&gt;</code> Field and <code>&lt;group2&gt;</code> in the Rule <code>&lt;Applied To&gt;</code> Field?</h2>
<p>In this test, I explore the interaction between the <code>&lt;Applied To&gt;</code> fields at the policy and rule levels when they reference different Security Groups. The configuration is as follows:</p>
<ul>
<li>The <strong>policy’s <code>&lt;Applied To&gt;</code> field</strong> is set to <strong>dFG_Alpine01</strong>.</li>
<li>The <strong>rule’s <code>&lt;Applied To&gt;</code> field</strong> is set to <strong>dFG_Alpine02</strong>.</li>
<li>The rule allows <strong>ICMP traffic</strong> from <strong>any</strong> source to the <strong>dFG_all_Alpine</strong> Security Group.</li>
</ul>
<h3 id="configuration-details-2">Configuration Details</h3>
<ol>
<li>
<p><strong>Policy</strong>:</p>
<ul>
<li>Name: <strong>ICMP Allow Test 3</strong></li>
<li>Source: <strong>Any</strong></li>
<li>Destination: <strong>dFG_all_Alpine</strong></li>
<li>Service: <strong>ICMP</strong></li>
<li>Action: <strong>Allow</strong></li>
<li><code>&lt;Applied To&gt;</code>: <strong>dFG_Alpine01</strong></li>
</ul>
</li>
<li>
<p><strong>Rule</strong>:</p>
<ul>
<li><code>&lt;Applied To&gt;</code>: <strong>dFG_Alpine02</strong></li>
</ul>
</li>
</ol>
<h3 id="expected-behavior-2">Expected Behavior</h3>
<p>In this configuration, the <strong>policy’s <code>&lt;Applied To&gt;</code> field</strong> restricts the scope of the policy to <strong>dFG_Alpine01</strong>, which includes only <strong>Alpine01</strong>. However, the <strong>rule’s <code>&lt;Applied To&gt;</code> field</strong> is set to <strong>dFG_Alpine02</strong>, which includes only <strong>Alpine02</strong>.</p>
<p>The expected behavior is:</p>
<ol>
<li>The policy’s <code>&lt;Applied To&gt;</code> field should dictate the overall scope, meaning the rule will only be <strong>enforced for members of dFG_Alpine01</strong>.</li>
<li>Even though the rule’s <code>&lt;Applied To&gt;</code> field is set to <strong>dFG_Alpine02</strong>, the rule will not be applied to <strong>Alpine02</strong>, because Alpine02 is outside the policy’s scope.</li>
<li>ICMP traffic from <strong>any source</strong> to <strong>dFG_all_Alpine</strong> will only be allowed for VMs within <strong>dFG_Alpine01</strong>.</li>
</ol>
<h3 id="test-results-1">Test Results</h3>
<h4 id="firewall-rules-observed-2">Firewall Rules Observed</h4>
<ol>
<li><strong>Alpine01 (nic-533240-eth0-vmware-sfw.2)</strong>:</li>
</ol>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ruleset mainrs {
</span></span><span class="line"><span class="cl"># PRE_FILTER rules
</span></span><span class="line"><span class="cl">rule 10216 at 1 inout protocol tcp strict from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 port 22 accept;
</span></span><span class="line"><span class="cl"># FILTER (APP Category) rules
</span></span><span class="line"><span class="cl">rule 10217 at 1 inout protocol icmp from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 accept;
</span></span><span class="line"><span class="cl">rule 10217 at 2 inout protocol ipv6-icmp from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 accept;
</span></span><span class="line"><span class="cl">rule 2 at 3 inout protocol any from any to any reject with log tag &#39;debug&#39;;
</span></span><span class="line"><span class="cl">}
</span></span><span class="line"><span class="cl">ruleset mainrs_L2 {
</span></span><span class="line"><span class="cl"># FILTER rules
</span></span><span class="line"><span class="cl">rule 1 at 1 inout ethertype any stateless from any to any accept;
</span></span><span class="line"><span class="cl">}
</span></span></code></pre></div><p>Key Observations:</p>
<p>ICMP traffic (IPv4 and IPv6) is explicitly allowed (rule 10217) for Alpine01.
This behavior aligns with the policy&rsquo;s <code>&lt;Applied To&gt;</code> field, as Alpine01 is part of dFG_Alpine01.</p>
<ol start="2">
<li><strong>Alpine02 (nic-533279-eth0-vmware-sfw.2):</strong></li>
</ol>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ruleset mainrs {
</span></span><span class="line"><span class="cl"># PRE_FILTER rules
</span></span><span class="line"><span class="cl">rule 10216 at 1 inout protocol tcp strict from any to addrset a34212cb-acb2-49b3-b74c-7683c0345a19 port 22 accept;
</span></span><span class="line"><span class="cl"># FILTER (APP Category) rules
</span></span><span class="line"><span class="cl">rule 2 at 1 inout protocol any from any to any reject with log tag &#39;debug&#39;;
</span></span><span class="line"><span class="cl">}
</span></span></code></pre></div><p>Key Observations:</p>
<p>No ICMP allow rule is present for Alpine02, as it is not part of dFG_Alpine01, which is referenced in the policy&rsquo;s <code>&lt;Applied To&gt;</code> field.
Apart from the SSH pre-filter rule (rule 10216), all traffic is rejected at the main ruleset level (rule 2), confirming that the rule-level <code>&lt;Applied To&gt;</code> field (set to dFG_Alpine02) does not enforce the rule here.</p>
<ol start="3">
<li><strong>Alpine03 (Control VM, nic-544799-eth0-vmware-sfw.2)</strong>:</li>
</ol>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ruleset mainrs {
</span></span><span class="line"><span class="cl"># FILTER (APP Category) rules
</span></span><span class="line"><span class="cl">rule 2 at 1 inout protocol any from any to any reject with log tag &#39;debug&#39;;
</span></span><span class="line"><span class="cl">}
</span></span></code></pre></div><p>Key Observations:</p>
<p>The only rule applied at the main ruleset level is a default reject rule (rule 2), which blocks all traffic. This confirms that no ICMP allow rule from the policy or rule is applied to Alpine03, as it is not part of dFG_Alpine01.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Important Note: Double Enforcement in NSX Distributed Firewall</b>
        </div>
        <div class="admonition-content"><p>Even though traffic to <strong>Alpine01</strong> is allowed from <strong>any</strong> source, <strong>Alpine02</strong> cannot ping <strong>Alpine01</strong>. This is because <strong>Alpine02</strong> does not have a corresponding firewall rule permitting ICMP traffic for outbound communication.</p>
<p>In <strong>NSX Distributed Firewall</strong>, firewall rules are evaluated <strong>twice</strong> for traffic between VMs managed by NSX:</p>
<ol>
<li><strong>At the source VM</strong>: Outbound traffic must match a rule that permits it to leave the VM.</li>
<li><strong>At the destination VM</strong>: Inbound traffic must match a rule that permits it to reach the VM.</li>
</ol>
<p>This double-enforcement model ensures precise control over traffic flow but requires careful consideration when designing firewall policies. Both source and destination rules must be configured to allow traffic for successful communication.</p>
<h3 id="using-the-nsx-traceflow-tool">Using the NSX Traceflow Tool</h3>
<p>The <strong>Traceflow Tool</strong> in NSX provides an excellent way to visualize this behavior. By tracing packets, you can observe how traffic is evaluated and where it is blocked or allowed. In this scenario, Traceflow clearly demonstrates that traffic originating from <strong>Alpine02</strong> is blocked at the source due to the absence of an ICMP allow rule, even though the destination (<strong>Alpine01</strong>) has an allow rule.</p>
<p>Understanding this double-enforcement logic is crucial for troubleshooting and optimizing NSX Distributed Firewall configurations.</p>
</div>
    </aside>
<figure><a href="traceflow.png"><picture><source srcset="/nsx-apply-to/traceflow_hu_d615ec3cb2297386.png" type="image/png">
          <img
            src="/nsx-apply-to/traceflow_hu_d615ec3cb2297386.png" alt="Traceflow" width="1335"
            height="849"/>
        </picture></a><figcaption><p>NSX Traceflow (click to enlarge)</p></figcaption></figure>
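<p>The double-enforcement behavior described in the note can be illustrated with a small model. This is a simplified sketch, not how NSX evaluates rules internally: it matches on protocol only, ignores address sets and direction, and the names <code>Rule</code>, <code>first_match</code> and <code>trace</code> are made up for illustration. The rule lists mirror the Test 3 dumps for Alpine01 and Alpine02.</p>

```python
from typing import NamedTuple


class Rule(NamedTuple):
    rule_id: int
    protocol: str  # "icmp", "tcp" or "any"
    action: str    # "accept" or "reject"


def first_match(rules, protocol):
    """Return the first rule matching the protocol (first match wins)."""
    for rule in rules:
        if rule.protocol in (protocol, "any"):
            return rule
    return None


def trace(src_rules, dst_rules, protocol):
    """Evaluate a flow at the source vNIC first, then at the destination vNIC."""
    src = first_match(src_rules, protocol)
    if src is None or src.action != "accept":
        return "dropped at source"
    dst = first_match(dst_rules, protocol)
    if dst is None or dst.action != "accept":
        return "dropped at destination"
    return "delivered"


# Simplified rule tables taken from the Test 3 results.
ALPINE01 = [Rule(10217, "icmp", "accept"), Rule(2, "any", "reject")]
ALPINE02 = [Rule(2, "any", "reject")]

print("Alpine02 to Alpine01:", trace(ALPINE02, ALPINE01, "icmp"))
print("Alpine01 to Alpine02:", trace(ALPINE01, ALPINE02, "icmp"))
```

<p>In this model the ping from Alpine02 never leaves the source, which matches what Traceflow reports, while a ping in the opposite direction passes the source check and is rejected at the destination.</p>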
<h2 id="answering-question-4-does-the-rule-applied-to-field-or-the-policy-applied-to-field-take-precedence">Answering Question 4: Does the Rule <code>&lt;Applied To&gt;</code> Field or the Policy <code>&lt;Applied To&gt;</code> Field Take Precedence?</h2>
<p>Through our tests, we can confidently answer this question: <strong>The policy <code>&lt;Applied To&gt;</code> field takes precedence over the rule <code>&lt;Applied To&gt;</code> field</strong>.</p>
<h3 id="key-findings">Key Findings:</h3>
<ul>
<li>
<p>The <strong>policy’s <code>&lt;Applied To&gt;</code> field</strong> defines the overall scope of enforcement. This means that if a VM or Security Group is excluded by the policy <code>&lt;Applied To&gt;</code> field, no rules from that policy will apply to it, regardless of the rule <code>&lt;Applied To&gt;</code> field.</p>
</li>
<li>
<p>When the policy&rsquo;s <code>&lt;Applied To&gt;</code> field is set to a Security Group, the rule&rsquo;s <code>&lt;Applied To&gt;</code> field cannot further restrict or extend enforcement within the scope defined by the policy. It is simply ignored.</p>
</li>
</ul>
<h3 id="evidence">Evidence:</h3>
<ul>
<li>In <strong>Test 3</strong>, we demonstrated that even though the rule <code>&lt;Applied To&gt;</code> field was set to <strong>dFG_Alpine02</strong>, the rule was not applied to <strong>Alpine02</strong> because the policy <code>&lt;Applied To&gt;</code> field limited enforcement to <strong>dFG_Alpine01</strong>.</li>
<li>This behavior clearly shows that the policy <code>&lt;Applied To&gt;</code> field is the deciding factor in scoping firewall rule enforcement.</li>
</ul>
<p>By understanding this precedence, administrators can better design their NSX firewall policies to avoid conflicts or unintended behavior.</p>
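<p>The outcome of the three tests can be condensed into a few lines of logic. The sketch below is only a model of the observed behavior, not NSX code. The function name <code>enforced_on</code> is hypothetical, the group memberships are the ones used in this lab, and the case where both fields are set to DFW is assumed to mean "all VMs".</p>

```python
DFW = "DFW"

# Lab group memberships as described in the tests.
GROUPS = {
    "dFG_all_Alpine": {"Alpine01", "Alpine02"},
    "dFG_Alpine01": {"Alpine01"},
    "dFG_Alpine02": {"Alpine02"},
}
ALL_VMS = {"Alpine01", "Alpine02", "Alpine03"}


def enforced_on(policy_applied_to, rule_applied_to):
    """Return the VMs that receive the rule, based on the observed results."""
    if policy_applied_to != DFW:
        # A group on the policy wins; the rule-level field is ignored.
        return set(GROUPS[policy_applied_to])
    if rule_applied_to != DFW:
        # Policy is DFW, so the rule-level group defines the scope.
        return set(GROUPS[rule_applied_to])
    # Assumption: DFW on both levels pushes the rule everywhere.
    return set(ALL_VMS)


print("Test 1:", sorted(enforced_on("dFG_all_Alpine", DFW)))
print("Test 2:", sorted(enforced_on(DFW, "dFG_Alpine01")))
print("Test 3:", sorted(enforced_on("dFG_Alpine01", "dFG_Alpine02")))
```

<p>Test 3 is the interesting case: the rule lands on Alpine01 and not on Alpine02, even though the rule itself names dFG_Alpine02.</p>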
<h2 id="why-use-applied-to-in-nsx">Why Use <code>&lt;Applied To&gt;</code> in NSX?</h2>
<p>The <strong><code>&lt;Applied To&gt;</code></strong> field in NSX Distributed Firewall is a powerful tool that enables administrators to optimize rule enforcement and improve overall performance in their environments. While it might seem optional at first glance, there are several key reasons to leverage this feature:</p>
<h3 id="1-optimizing-resource-usage">1. <strong>Optimizing Resource Usage</strong></h3>
<p>By default, Distributed Firewall rules are applied to all ESXi hosts in the cluster, even if they are irrelevant to some workloads. Using the <code>&lt;Applied To&gt;</code> field allows you to:</p>
<ul>
<li><strong>Restrict rule enforcement</strong> to specific VMs, Security Groups, or segments.</li>
<li>Reduce unnecessary rule propagation across unrelated hosts.</li>
<li>Minimize the overhead of processing firewall rules.</li>
</ul>
<h3 id="2-enhancing-rule-clarity-and-management">2. <strong>Enhancing Rule Clarity and Management</strong></h3>
<p>When <code>&lt;Applied To&gt;</code> is used correctly, it provides clear boundaries for where rules are enforced:</p>
<ul>
<li>It helps avoid confusion about which VMs are affected by specific rules.</li>
<li>It simplifies troubleshooting by narrowing down the scope of rule application.</li>
<li>It prevents accidental rule application to unintended workloads.</li>
</ul>
<h3 id="3-improving-security-posture">3. <strong>Improving Security Posture</strong></h3>
<p>Restricting the scope of firewall rules reduces the attack surface:</p>
<ul>
<li>Only the VMs, segments, or groups explicitly defined in the <code>&lt;Applied To&gt;</code> field will be impacted by the rule.</li>
<li>This minimizes the risk of unintentionally exposing unrelated workloads to less restrictive rules.</li>
</ul>
<h3 id="4-avoiding-overlap-and-rule-conflicts">4. <strong>Avoiding Overlap and Rule Conflicts</strong></h3>
<p>In complex environments, rules can overlap or conflict, leading to unexpected behavior. By carefully defining <code>&lt;Applied To&gt;</code> fields:</p>
<ul>
<li>You ensure that rules are scoped to their intended targets, reducing the risk of conflicts.</li>
<li>You can isolate specific rule sets for testing or special cases without affecting unrelated traffic.</li>
</ul>
<h3 id="conclusion">Conclusion</h3>
<p>While the <strong><code>&lt;Applied To&gt;</code></strong> field might add an extra layer of configuration, it plays a vital role in optimizing NSX Distributed Firewall performance, clarity, and security. By carefully designing and applying <code>&lt;Applied To&gt;</code> settings at both the policy and rule levels, administrators can achieve a more efficient, secure, and manageable firewall implementation.
In addition, there is a limit to the number of firewall rules that can be applied to an ESX host and to a virtual NIC. The current limits may change with the NSX version and can be looked up in the Configmax tool from Broadcom.</p>
]]></content>
		</item>
		
		<item>
			<title>More performance through NSX Edge TEP groups?</title>
			<link>https://sdn-warrior.org/posts/nsx-tep-groups/</link>
			<pubDate>Fri, 03 Jan 2025 12:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/nsx-tep-groups/</guid>
			<description><![CDATA[How to use Edge TEP groups in NSX]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>My esteemed colleague, Steven Schramm, recently published an excellent article titled <em><a href="https://sdn-techtalk.com/posts/multitep-ha/">Improving NSX Datacenter TEP Performance and Availability - Multi-TEP and TEP Group High Availability</a></em>. This inspired me to explore how TEP Groups influence performance in NSX, specifically focusing on how North/South traffic can benefit from their implementation.</p>
<h2 id="what-are-tep-groups-and-why-are-they-interesting">What Are TEP Groups and Why Are They Interesting?</h2>
<p>With NSX 4.2.1, TEP High Availability (HA) for Edge Transport Nodes was introduced. In addition to the HA feature, the load-sharing behavior was also modified.</p>
<p>Before NSX 4.2.1, each segment was bound to a single TEP interface. This limitation meant that North/South traffic could only utilize the maximum throughput of one physical adapter on the ESXi host where the Edge VM runs. With TEP HA and the introduction of TEP Groups, this behavior has changed significantly.</p>
<p>It is worth noting that prior to TEP HA, a Multi-TEP implementation was already available. While this allowed for failover within the TEP network if a physical adapter lost its link, it did not address Layer 2 or Layer 3 issues. For more details on this topic, I recommend reading Steven’s article.</p>
<p>However, TEP HA is not enabled by default and, as of today, can only be activated via the API.</p>
<h2 id="lab-setup">LAB Setup</h2>
<p>For this exploration, I am running NSX 4.2.1 on three Intel NUC Pro devices, each equipped with dual 2.5 Gigabit LAN adapters. My test VMs are pinned to different hosts using DRS rules to ensure separation and accurate testing conditions.</p>
<p>Multi-TEP is configured in the setup, but TEP HA has not yet been enabled.</p>
<p>The test environment includes four Alpine Linux VMs, each connected to the same segment.</p>
<ul>
<li><strong>Alpine1</strong>: <code>10.10.20.10</code></li>
<li><strong>Alpine2</strong>: <code>10.10.20.20</code></li>
<li><strong>Alpine3</strong>: <code>10.10.20.30</code></li>
<li><strong>Alpine4</strong>: <code>10.10.20.40</code></li>
</ul>
<p>These VMs are distributed across two ESXi servers to simulate North/South traffic under real-world conditions. A third ESXi server hosts the <strong>NSX Edge VM</strong>, responsible for North/South traffic.</p>
<p>To evaluate performance, my iPerf target is located on a separate server with a <strong>10 Gb/s connection</strong>, ensuring that the network backbone does not introduce any bottlenecks. This setup provides a robust environment to test TEP HA and its impact on North/South traffic.</p>
<h2 id="baseline-tests-northsouth-capacity">Baseline Tests: North/South Capacity</h2>
<p>To measure the maximum North/South capacity of the setup, I ran iPerf tests simultaneously on all four Alpine VMs. Each VM generated traffic towards the iPerf target server with a 10 Gb/s connection. Below are the individual results:</p>
<ul>
<li><strong>Alpine1</strong>: 554 Mbps</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[ ID] Interval Transfer Bitrate Retr 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 6.45 GBytes 554 Mbits/sec 727 sender 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 6.45 GBytes 554 Mbits/sec receiver
</span></span></code></pre></div><ul>
<li><strong>Alpine2</strong>: 807 Mbps</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[ ID] Interval Transfer Bitrate Retr 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 9.40 GBytes 807 Mbits/sec 1713 sender 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 9.39 GBytes 807 Mbits/sec receiver
</span></span></code></pre></div><ul>
<li><strong>Alpine3</strong>: 465 Mbps</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[ ID] Interval Transfer Bitrate Retr 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 5.42 GBytes 465 Mbits/sec 1196 sender 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 5.41 GBytes 466 Mbits/sec receiver
</span></span></code></pre></div><ul>
<li><strong>Alpine4</strong>: 529 Mbps</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[ ID] Interval Transfer Bitrate Retr 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 6.16 GBytes 529 Mbits/sec 1010 sender 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.01 sec 6.16 GBytes 529 Mbits/sec receiver
</span></span></code></pre></div><h3 id="total-throughput">Total Throughput</h3>
<p>The combined total throughput across all four VMs was <strong>2.355 Gbps</strong>, indicating the maximum North/South capacity under the current configuration.</p>
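<p>For reference, the total is simply the sum of the four sender bitrates. The short calculation below also relates it to the 2.5 Gb/s line rate of a single adapter from the lab setup. Treating that single adapter as the limiting factor is an assumption at this point in the test.</p>

```python
# Sender bitrates in Mbit/s, taken from the iPerf results above.
results_mbps = {"Alpine1": 554, "Alpine2": 807, "Alpine3": 465, "Alpine4": 529}

total_mbps = sum(results_mbps.values())
total_gbps = total_mbps / 1000

# Assumption: one 2.5 Gb/s physical adapter carries all of the traffic.
nic_line_rate_mbps = 2500
share = total_mbps / nic_line_rate_mbps

print(f"Total: {total_gbps:.3f} Gbps")
print(f"Share of one 2.5 Gb/s adapter: {share:.1%}")
```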
<h2 id="validating-physical-nic-utilization">Validating Physical NIC Utilization</h2>
<p>To monitor the utilization of the Edge VM, we can use <code>esxtop</code> on the ESXi server. By pressing <strong>&ldquo;n&rdquo;</strong>, we can switch to the network view and examine the statistics for the physical NICs (<code>vmnic0</code> and <code>vmnic1</code>) as well as the interfaces of the Edge VM.</p>
<p>My Edge VM is configured with four Fastpath interfaces:</p>
<ul>
<li><strong>fp0-fp1</strong>: Used for TEP traffic.</li>
<li><strong>fp2-fp3</strong>: Used for BGP uplinks.</li>
</ul>
<p>Additionally, <code>esxtop</code> displays the mapping of the Edge VM&rsquo;s interfaces to the respective physical NICs (<code>vmnic</code>). This allows us to verify how traffic is distributed across the available resources and ensures that both TEP and BGP traffic are leveraging the correct network paths.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">  67108895 1054173:edge04.lab.home.eth3       vmnic0 DvsPortset-0 &lt;--(fp2)         
</span></span><span class="line"><span class="cl">  67108897 1054173:edge04.lab.home.eth1       vmnic0 DvsPortset-0 &lt;--(fp0)
</span></span><span class="line"><span class="cl">  67108898 1054173:edge04.lab.home.eth0       vmnic1 DvsPortset-0 &lt;--(MGMT)
</span></span><span class="line"><span class="cl">  67108899 1054173:edge04.lab.home.eth2       vmnic1 DvsPortset-0 &lt;--(fp1)
</span></span><span class="line"><span class="cl">  67108900 1054173:edge04.lab.home.eth4       vmnic1 DvsPortset-0 &lt;--(fp3)   
</span></span></code></pre></div><p>In addition to monitoring this in <code>esxtop</code>, I can observe the activity on my switches. By checking the switch port bandwidth statistics, I can determine which physical adapter is actively handling the iPerf traffic and which one is idle. This provides an additional layer of validation for the distribution of traffic across the available pNICs.</p>
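<p>The port listing from <code>esxtop</code> shown earlier can also be grouped per physical NIC with a few lines of code. This is a rough sketch that assumes the listing has been copied into a string with the first three columns intact. The <code>ROLE</code> table reproduces my manual annotations (fp0 to fp3, MGMT) and is not something <code>esxtop</code> prints, and <code>ports_by_pnic</code> is a made-up helper name.</p>

```python
from collections import defaultdict

# First four columns of the esxtop network view for the Edge VM.
ESXTOP_PORTS = """\
  67108895 1054173:edge04.lab.home.eth3       vmnic0 DvsPortset-0
  67108897 1054173:edge04.lab.home.eth1       vmnic0 DvsPortset-0
  67108898 1054173:edge04.lab.home.eth0       vmnic1 DvsPortset-0
  67108899 1054173:edge04.lab.home.eth2       vmnic1 DvsPortset-0
  67108900 1054173:edge04.lab.home.eth4       vmnic1 DvsPortset-0
"""

# Manual annotation: which Edge interface each vNIC backs in this lab.
ROLE = {"eth0": "MGMT", "eth1": "fp0", "eth2": "fp1", "eth3": "fp2", "eth4": "fp3"}


def ports_by_pnic(listing):
    """Group the Edge VM interfaces by the physical NIC they are pinned to."""
    mapping = defaultdict(list)
    for line in listing.splitlines():
        if not line.strip():
            continue
        _port_id, used_by, pnic = line.split()[:3]
        vnic = used_by.rsplit(".", 1)[-1]  # e.g. "eth3"
        mapping[pnic].append(ROLE.get(vnic, vnic))
    return dict(mapping)


for pnic, interfaces in ports_by_pnic(ESXTOP_PORTS).items():
    print(pnic, interfaces)
```

<p>With the listing from this lab, each physical NIC carries one TEP interface (fp0 or fp1) and one BGP uplink interface (fp2 or fp3).</p>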
<h2 id="configuring-multi-tep-ha">Configuring Multi-TEP HA</h2>
<p>The process of enabling Multi-TEP High Availability (HA) is straightforward. It begins with creating a TEP HA Host Switch Profile. This is done through a simple API call using the <code>PUT</code> method to the following endpoint:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">PUT https://&lt;nsx-policy-manager&gt;/policy/api/v1/infra/host-switch-profiles/nsxvtepha
</span></span></code></pre></div><h3 id="json-payload">JSON Payload</h3>
<p>The following JSON payload needs to be provided in the API request:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;enabled&#34;</span><span class="p">:</span> <span class="s2">&#34;true&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;failover_timeout&#34;</span><span class="p">:</span> <span class="s2">&#34;5&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;auto_recovery&#34;</span><span class="p">:</span> <span class="s2">&#34;true&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;auto_recovery_initial_wait&#34;</span><span class="p">:</span> <span class="s2">&#34;300&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;auto_recovery_max_backoff&#34;</span><span class="p">:</span> <span class="s2">&#34;86400&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;resource_type&#34;</span><span class="p">:</span> <span class="s2">&#34;PolicyVtepHAHostSwitchProfile&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;display_name&#34;</span><span class="p">:</span> <span class="s2">&#34;nsxvtepha&#34;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><p>Key Parameters:</p>
<ul>
<li>enabled: Enables TEP HA functionality (true or false).</li>
<li>failover_timeout: Specifies the timeout (in seconds) for failover to occur.</li>
<li>auto_recovery: Enables automatic recovery of TEPs after a failure.</li>
<li>auto_recovery_initial_wait: Time (in seconds) before initiating the first recovery attempt.</li>
<li>auto_recovery_max_backoff: Maximum backoff time (in seconds) for recovery attempts.</li>
<li>display_name: A human-readable name for the profile.</li>
</ul>
<p>This API call creates the TEP HA Host Switch Profile, which can then be applied to the desired transport nodes to enable Multi-TEP HA functionality.</p>
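<p>If you prefer to script the call instead of using an API client, the request could be assembled roughly as follows. This is a sketch using only the Python standard library. The hostname <code>nsx.lab.example</code> and the helper name <code>build_tep_ha_request</code> are placeholders, and authentication as well as certificate handling are deliberately left out, so the request is only built and not sent.</p>

```python
import json
import urllib.request


def build_tep_ha_request(manager, profile_id="nsxvtepha"):
    """Build (but do not send) the PUT request for the TEP HA profile."""
    url = f"https://{manager}/policy/api/v1/infra/host-switch-profiles/{profile_id}"
    # Payload values are kept exactly as in the JSON example above.
    payload = {
        "enabled": "true",
        "failover_timeout": "5",
        "auto_recovery": "true",
        "auto_recovery_initial_wait": "300",
        "auto_recovery_max_backoff": "86400",
        "resource_type": "PolicyVtepHAHostSwitchProfile",
        "display_name": profile_id,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder manager address; replace it with your own NSX Manager.
request = build_tep_ha_request("nsx.lab.example")
print(request.get_method(), request.full_url)
```

<p>To actually send it, you would add an authorization header that fits your environment and pass the request to <code>urllib.request.urlopen</code>.</p>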
<h2 id="assigning-the-tep-ha-profile">Assigning the TEP HA Profile</h2>
<p>To enable the Multi-TEP HA feature, the created TEP HA profile must be assigned to a <strong>Transport Node Profile</strong>. This assignment ensures that the specified hosts will have the Multi-TEP HA feature enabled.</p>
<p>Steps to Assign the TEP HA Profile:</p>
<ol>
<li><strong>Gather the Transport Node Profile ID</strong>:
Retrieve the ID of the transport node profile that you want to map the TEP HA profile to. Without this ID, you cannot complete the assignment.</li>
</ol>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">GET https://&lt;nsx-policy-manager&gt;/policy/api/v1/infra/host-transport-node-profiles/
</span></span></code></pre></div><ol start="2">
<li><strong>Assign the TEP HA Profile</strong>:
Use the API to update the transport node profile by linking it with the TEP HA profile. The request must specify the IDs of both the transport node profile and the TEP HA profile.</li>
</ol>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">PUT https://&lt;nsx-policy-manager&gt;/policy/api/v1/infra/host-transport-node-profiles/&lt;tnp-id&gt; 
</span></span></code></pre></div><p>Add the following entry to the transport node profile to link it with the TEP HA profile:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;key&#34;</span><span class="p">:</span> <span class="s2">&#34;VtepHAHostSwitchProfile&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;value&#34;</span><span class="p">:</span> <span class="s2">&#34;/infra/host-switch-profiles/nsxvtepha&#34;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><p>The full transport node profile looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl">  <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;host_switch_spec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;host_switches&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;host_switch_name&#34;</span><span class="p">:</span> <span class="s2">&#34;NSX_vCompute3&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;host_switch_id&#34;</span><span class="p">:</span> <span class="s2">&#34;50 27 cc 64 fe fc 4b 00-b1 af 91 5d 11 78 b9 06&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;host_switch_type&#34;</span><span class="p">:</span> <span class="s2">&#34;VDS&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;host_switch_mode&#34;</span><span class="p">:</span> <span class="s2">&#34;STANDARD&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;ecmp_mode&#34;</span><span class="p">:</span> <span class="s2">&#34;L3&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;host_switch_profile_ids&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">              <span class="nt">&#34;key&#34;</span><span class="p">:</span> <span class="s2">&#34;UplinkHostSwitchProfile&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">              <span class="nt">&#34;value&#34;</span><span class="p">:</span> <span class="s2">&#34;/infra/host-switch-profiles/HostUplink&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="p">},</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">              <span class="nt">&#34;key&#34;</span><span class="p">:</span> <span class="s2">&#34;VtepHAHostSwitchProfile&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">              <span class="nt">&#34;value&#34;</span><span class="p">:</span> <span class="s2">&#34;/infra/host-switch-profiles/nsxvtepha&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">          <span class="p">],</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;uplinks&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">              <span class="nt">&#34;vds_uplink_name&#34;</span><span class="p">:</span> <span class="s2">&#34;Uplink 1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">              <span class="nt">&#34;uplink_name&#34;</span><span class="p">:</span> <span class="s2">&#34;Uplink1&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="p">},</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">              <span class="nt">&#34;vds_uplink_name&#34;</span><span class="p">:</span> <span class="s2">&#34;Uplink 2&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">              <span class="nt">&#34;uplink_name&#34;</span><span class="p">:</span> <span class="s2">&#34;Uplink2&#34;</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">          <span class="p">],</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;is_migrate_pnics&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;ip_assignment_spec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;ip_pool_id&#34;</span><span class="p">:</span> <span class="s2">&#34;/infra/ip-pools/tep&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;resource_type&#34;</span><span class="p">:</span> <span class="s2">&#34;StaticIpPoolSpec&#34;</span>
</span></span><span class="line"><span class="cl">          <span class="p">},</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;cpu_config&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">          <span class="p">],</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;transport_zone_endpoints&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">              <span class="nt">&#34;transport_zone_id&#34;</span><span class="p">:</span> <span class="s2">&#34;/infra/sites/default/enforcement-points/default/transport-zones/OVERLAYTZ&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">              <span class="nt">&#34;transport_zone_profile_ids&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">              <span class="p">]</span>
</span></span><span class="line"><span class="cl">            <span class="p">},</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">              <span class="nt">&#34;transport_zone_id&#34;</span><span class="p">:</span> <span class="s2">&#34;/infra/sites/default/enforcement-points/default/transport-zones/MVLAN&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">              <span class="nt">&#34;transport_zone_profile_ids&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">              <span class="p">]</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">          <span class="p">],</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;not_ready&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;portgroup_transport_zone_id&#34;</span><span class="p">:</span> <span class="s2">&#34;/infra/sites/default/enforcement-points/default/transport-zones/eb370bd3-db11-319c-98ec-585e402bf98c&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">      <span class="p">],</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;resource_type&#34;</span><span class="p">:</span> <span class="s2">&#34;StandardHostSwitchSpec&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;ignore_overridden_hosts&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;resource_type&#34;</span><span class="p">:</span> <span class="s2">&#34;PolicyHostTransportNodeProfile&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;45104efd-72bf-4d69-bc24-87d45b03b402&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;display_name&#34;</span><span class="p">:</span> <span class="s2">&#34;HostTNP&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;path&#34;</span><span class="p">:</span> <span class="s2">&#34;/infra/host-transport-node-profiles/45104efd-72bf-4d69-bc24-87d45b03b402&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;relative_path&#34;</span><span class="p">:</span> <span class="s2">&#34;45104efd-72bf-4d69-bc24-87d45b03b402&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;parent_path&#34;</span><span class="p">:</span> <span class="s2">&#34;/infra&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;remote_path&#34;</span><span class="p">:</span> <span class="s2">&#34;&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;unique_id&#34;</span><span class="p">:</span> <span class="s2">&#34;45104efd-72bf-4d69-bc24-87d45b03b402&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;realization_id&#34;</span><span class="p">:</span> <span class="s2">&#34;45104efd-72bf-4d69-bc24-87d45b03b402&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;owner_id&#34;</span><span class="p">:</span> <span class="s2">&#34;1ec3eeb1-8da7-457d-bebe-a8b2b47df7de&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;marked_for_delete&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;overridden&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;_system_owned&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;_protection&#34;</span><span class="p">:</span> <span class="s2">&#34;NOT_PROTECTED&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;_create_time&#34;</span><span class="p">:</span> <span class="mi">1723569644857</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;_create_user&#34;</span><span class="p">:</span> <span class="s2">&#34;admin&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;_last_modified_time&#34;</span><span class="p">:</span> <span class="mi">1732706346575</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;_last_modified_user&#34;</span><span class="p">:</span> <span class="s2">&#34;admin&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;_revision&#34;</span><span class="p">:</span> <span class="mi">3</span>
</span></span><span class="line"><span class="cl">  <span class="p">}</span>
</span></span></code></pre></div><h3 id="important-notes">Important Notes</h3>
<ul>
<li>Ensure the transport node profile is correctly assigned to the desired hosts to enable Multi-TEP HA.</li>
<li>If the <code>&#34;key&#34;: &#34;VtepHAHostSwitchProfile&#34;</code> entry is missing or misconfigured, TEP HA cannot be activated.</li>
<li>The <code>value</code> field must match the path of the TEP HA profile you created.</li>
</ul>
<p>This process is crucial for leveraging the full capabilities of Multi-TEP HA in NSX environments.</p>
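<p>The edit to the profile can be sketched in a few lines of Python. This is a minimal sketch, not an official client: <code>link_tep_ha_profile</code> is a hypothetical helper, it works on the already parsed JSON body of the GET request, and it assumes that every host switch in the profile should receive the entry. The rest of the payload, including <code>_revision</code>, is left untouched so the result can serve as the PUT body.</p>

```python
import copy

TEP_HA_KEY = "VtepHAHostSwitchProfile"


def link_tep_ha_profile(tnp, profile_path):
    """Return a copy of a transport node profile with the TEP HA profile linked.

    Hypothetical helper: `tnp` is the JSON body returned by the GET request,
    already parsed into a dict. Updating every host switch is an assumption
    made for this sketch.
    """
    updated = copy.deepcopy(tnp)
    for host_switch in updated["host_switch_spec"]["host_switches"]:
        profile_ids = host_switch.setdefault("host_switch_profile_ids", [])
        for entry in profile_ids:
            if entry.get("key") == TEP_HA_KEY:
                # Entry already exists: point it at the requested profile.
                entry["value"] = profile_path
                break
        else:
            # No TEP HA entry yet: append one next to the uplink profile.
            profile_ids.append({"key": TEP_HA_KEY, "value": profile_path})
    return updated


# Trimmed-down profile, shaped like the full transport node profile above.
profile = {
    "host_switch_spec": {
        "host_switches": [
            {
                "host_switch_name": "NSX_vCompute3",
                "host_switch_profile_ids": [
                    {
                        "key": "UplinkHostSwitchProfile",
                        "value": "/infra/host-switch-profiles/HostUplink",
                    }
                ],
            }
        ]
    },
    "_revision": 3,
}

body = link_tep_ha_profile(profile, "/infra/host-switch-profiles/nsxvtepha")
```

<p>Because the helper returns a copy and overwrites an existing entry instead of appending a second one, running it twice on the same profile yields the same body.</p>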
<h2 id="enabling-edge-tep-groups">Enabling Edge TEP Groups</h2>
<p>To enable the TEP Group feature on Edge nodes, the global connectivity configuration must be updated. This is achieved by modifying the <code>tep_group_config</code> parameter via an API call.</p>
<p>Use the following API request to enable the TEP Group feature on Edge nodes:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">PUT https://&lt;NSX manager&gt;/policy/api/v1/infra/connectivity-global-config
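</span></span></code></pre></div><p>JSON payload (excerpt; the <code>// ...</code> lines stand for the remaining fields of the existing global config):</p>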
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// ...
</span></span></span><span class="line"><span class="cl">    <span class="nt">&#34;tep_group_config&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;enable_tep_grouping_on_edge&#34;</span><span class="p">:</span> <span class="kc">true</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;resource_type&#34;</span><span class="p">:</span> <span class="s2">&#34;GlobalConfig&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><p>The full global config looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;fips&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;lb_fips_enabled&#34;</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;tls_fips_enabled&#34;</span><span class="p">:</span> <span class="kc">false</span>
</span></span><span class="line"><span class="cl">	<span class="p">},</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;l3_forwarding_mode&#34;</span><span class="p">:</span> <span class="s2">&#34;IPV4_ONLY&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;uplink_mtu_threshold&#34;</span><span class="p">:</span> <span class="mi">9000</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;vdr_mac&#34;</span><span class="p">:</span> <span class="s2">&#34;02:50:56:56:44:52&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;vdr_mac_nested&#34;</span><span class="p">:</span> <span class="s2">&#34;02:50:56:56:44:53&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;allow_changing_vdr_mac_in_use&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;arp_limit_per_gateway&#34;</span><span class="p">:</span> <span class="mi">50000</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;external_gateway_bfd&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;bfd_profile_path&#34;</span><span class="p">:</span> <span class="s2">&#34;/infra/bfd-profiles/default-external-gw-bfd-profile&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;enable&#34;</span><span class="p">:</span> <span class="kc">true</span>
</span></span><span class="line"><span class="cl">	<span class="p">},</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;lb_ecmp&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;remote_tunnel_physical_mtu&#34;</span><span class="p">:</span> <span class="mi">1700</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;physical_uplink_mtu&#34;</span><span class="p">:</span> <span class="mi">9000</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;global_replication_mode_enabled&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;is_inherited&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;site_infos&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">	<span class="p">],</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;tep_group_config&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;enable_tep_grouping_on_edge&#34;</span><span class="p">:</span> <span class="kc">true</span>
</span></span><span class="line"><span class="cl">	<span class="p">},</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;resource_type&#34;</span><span class="p">:</span> <span class="s2">&#34;GlobalConfig&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;global-config&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;display_name&#34;</span><span class="p">:</span> <span class="s2">&#34;default&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;path&#34;</span><span class="p">:</span> <span class="s2">&#34;/infra/global-config&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;relative_path&#34;</span><span class="p">:</span> <span class="s2">&#34;global-config&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;parent_path&#34;</span><span class="p">:</span> <span class="s2">&#34;/infra&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;remote_path&#34;</span><span class="p">:</span> <span class="s2">&#34;&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;unique_id&#34;</span><span class="p">:</span> <span class="s2">&#34;071c1408-8d73-42ea-b2ad-b85cc43c96b2&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;realization_id&#34;</span><span class="p">:</span> <span class="s2">&#34;071c1408-8d73-42ea-b2ad-b85cc43c96b2&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;owner_id&#34;</span><span class="p">:</span> <span class="s2">&#34;1ec3eeb1-8da7-457d-bebe-a8b2b47df7de&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;marked_for_delete&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;overridden&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;_system_owned&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;_protection&#34;</span><span class="p">:</span> <span class="s2">&#34;NOT_PROTECTED&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;_create_time&#34;</span><span class="p">:</span> <span class="mi">1723479213559</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;_create_user&#34;</span><span class="p">:</span> <span class="s2">&#34;system&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;_last_modified_time&#34;</span><span class="p">:</span> <span class="mi">1735854980053</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;_last_modified_user&#34;</span><span class="p">:</span> <span class="s2">&#34;admin&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;_revision&#34;</span><span class="p">:</span> <span class="mi">5</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><h2 id="verifying-changes-checking-edge-node-tep-groups">Verifying Changes: Checking Edge Node TEP Groups</h2>
<p>To confirm that TEP Groups are active, verify the configuration directly on an Edge Node via SSH. The command below lists the logical switches and their associated TEP Groups.</p>
<h3 id="command">Command</h3>
<p>Log in to the Edge Node via SSH and execute:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">get logical-switches
</span></span></code></pre></div><p>Sample Output:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">UUID                                   VNI          ENCAP    TEP_GROUP    NAME                                               GLOBAL_VNI(FED)
</span></span><span class="line"><span class="cl">7f7e8af0-299e-4354-a143-a6a3689db228   74753        GENEVE   293888       transit-rl-aa5420e0-3d2b-4ff7-b00e-5f234c2f7413                
</span></span><span class="line"><span class="cl">0abeab93-66ef-4b41-87b6-64164b450e8d   67587        GENEVE   293888       transit-bp-T1                                                  
</span></span><span class="line"><span class="cl">09243099-ebb7-41ae-bcf4-10e0b833cc24   68609        GENEVE   293888       inter-sr-routing-bp-T0-ECMP                                    
</span></span><span class="line"><span class="cl">6261cda0-558f-4a57-838c-d47c95945c31   71680        GENEVE   293888       T1-dhcp-ls
</span></span></code></pre></div><p>Key Parameters to Verify:</p>
<ul>
<li>TEP_GROUP: The column should display a valid TEP Group ID (e.g., 293888) for all logical switches.</li>
<li>Logical Switch Details: Ensure that all expected logical switches are listed, along with their VNI and encapsulation type (e.g., GENEVE).</li>
</ul>
<p>If the TEP_GROUP column shows values for the logical switches, it confirms that the TEP Group feature is active and functioning as expected. This verification ensures that your configuration changes are effective across the Edge Nodes.</p>
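<p>With many logical switches, the check can be scripted against a saved copy of the CLI output. The following is a minimal sketch: <code>switches_without_tep_group</code> is a hypothetical helper, and it assumes the whitespace-separated column layout of the sample output. How the column looks when TEP Groups are disabled is not shown here, so anything that is not a numeric ID is treated as missing.</p>

```python
def switches_without_tep_group(cli_output):
    """List UUIDs of logical switches whose TEP_GROUP column has no numeric ID.

    Assumes the column order of the sample output:
    UUID, VNI, ENCAP, TEP_GROUP, NAME.
    """
    missing = []
    for line in cli_output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "UUID":
            continue  # skip blank lines and the header row
        tep_group = fields[3] if len(fields) > 3 else ""
        if not tep_group.isdigit():
            # The UUID is always the first column, so use it as identifier.
            missing.append(fields[0])
    return missing


sample = """\
UUID                                   VNI          ENCAP    TEP_GROUP    NAME
7f7e8af0-299e-4354-a143-a6a3689db228   74753        GENEVE   293888       transit-rl-aa5420e0-3d2b-4ff7-b00e-5f234c2f7413
0abeab93-66ef-4b41-87b6-64164b450e8d   67587        GENEVE   293888       transit-bp-T1
"""

print(switches_without_tep_group(sample))  # prints []
```

<p>An empty list means every listed switch carries a TEP Group ID. Any UUID that is returned deserves a closer look on the Edge Node.</p>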
<h2 id="performance-tests-with-tep-groups">Performance Tests with TEP Groups</h2>
<p>To evaluate the impact of TEP Groups on performance, I ran simultaneous iPerf tests on all four Alpine VMs. Below are the individual results:</p>
<ul>
<li><strong>Alpine1</strong>: 1.14 Gbits/sec</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[ ID] Interval Transfer Bitrate Retr 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 13.3 GBytes 1.14 Gbits/sec 669 sender 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 13.3 GBytes 1.14 Gbits/sec receiver
</span></span></code></pre></div><ul>
<li><strong>Alpine2</strong>: 1.08 Gbits/sec</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[ ID] Interval Transfer Bitrate Retr 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 12.6 GBytes 1.08 Gbits/sec 774 sender 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 12.6 GBytes 1.08 Gbits/sec receiver
</span></span></code></pre></div><ul>
<li><strong>Alpine3</strong>: 1.10 Gbits/sec</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[ ID] Interval Transfer Bitrate Retr 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 12.8 GBytes 1.10 Gbits/sec 1002 sender 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 12.8 GBytes 1.10 Gbits/sec receiver
</span></span></code></pre></div><ul>
<li><strong>Alpine4</strong>: 1.11 Gbits/sec</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[ ID] Interval Transfer Bitrate Retr 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 12.9 GBytes 1.11 Gbits/sec 990 sender 
</span></span><span class="line"><span class="cl">[ 5] 0.00-100.00 sec 12.9 GBytes 1.11 Gbits/sec receiver
</span></span></code></pre></div><h3 id="total-throughput-1">Total Throughput</h3>
<p>The combined total throughput across all four VMs was <strong>4.43 Gbps</strong>, showing a significant improvement over the baseline tests without TEP Groups. This demonstrates the enhanced traffic distribution and performance benefits enabled by the Multi-TEP HA feature.</p>
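<p>The total is simply the sum of the four sender bitrates. The short sketch below redoes that arithmetic and also prints the share of each VM; the variable names are only illustrative.</p>

```python
# Per-VM sender bitrates in Gbit/s, copied from the iPerf results above.
bitrates = {"Alpine1": 1.14, "Alpine2": 1.08, "Alpine3": 1.10, "Alpine4": 1.11}

total = sum(bitrates.values())
print(f"Total: {total:.2f} Gbit/s")  # prints "Total: 4.43 Gbit/s"

# Share of the combined throughput carried by each VM.
for vm, rate in bitrates.items():
    print(f"{vm}: {rate / total:.1%} of total")
```

<p>Each VM ends up with roughly a quarter of the combined throughput, which matches the even distribution seen in the individual results.</p>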
<h2 id="cross-verification-on-the-switch">Cross-Verification on the Switch</h2>
<p>To further validate the results, I checked the physical interfaces of the ESXi server hosting the Edge VM directly on the switch. The switch statistics confirm that both physical interfaces are actively utilized during the iPerf tests.</p>
<h3 id="observations">Observations</h3>
<ul>
<li>Both physical interfaces (<code>vmnic0</code> and <code>vmnic1</code>) show significant traffic, indicating effective utilization and load balancing.</li>
<li>This behavior aligns with the expected performance of the TEP Groups feature, ensuring that traffic is distributed across multiple interfaces for maximum throughput.</li>
</ul>
<figure><picture><source srcset="/nsx-tep-groups/switch_hu_30bbcc322c81ace6.png" type="image/png">
          <img
            src="/nsx-tep-groups/switch_hu_30bbcc322c81ace6.png" alt="Switch Port View" width="1958"
            height="184"/>
        </picture><figcaption><p>Switch port view</p></figcaption></figure>
<p>The screenshot demonstrates how the Multi-TEP HA configuration efficiently balances the load across both physical NICs, validating the setup and confirming the improvements in traffic handling.</p>
<h2 id="final-thoughts">Final Thoughts</h2>
<p>TEP Groups can be easily integrated into any environment with a Multi-TEP setup without requiring significant modifications. The adjustments are minimal and pose a low risk to production environments.</p>
<p>In addition to the noticeable performance improvements, TEP Groups also provide significantly better High Availability (HA) handling. The performance gains are particularly impactful in environments with fewer NSX segments, where the previous load distribution method was less effective.</p>
<p>Moreover, TEP Groups can deliver higher performance for segments with high traffic loads, especially those previously constrained by the physical uplink&rsquo;s capacity. This makes TEP Groups a valuable enhancement for optimizing both performance and reliability in NSX deployments.</p>
<h2 id="further-resources">Further Resources</h2>
<p>For more details and in-depth explanations about Multi-TEP High Availability and TEP Groups, refer to the following resources:</p>
<ul>
<li><a href="https://sdn-techtalk.com/posts/multitep-ha/">Improving NSX Datacenter TEP Performance and Availability - Multi-TEP and TEP Group High Availability</a></li>
<li><a href="https://techdocs.broadcom.com/us/en/vmware-cis/nsx/vmware-nsx/4-2/administration-guide/host-switches/multi-tep-high-availability.html">VMware NSX Administration Guide: Multi-TEP High Availability</a></li>
</ul>
]]></content>
		</item>
		
		<item>
			<title>Redeploy an NSX Edge VM Appliance</title>
			<link>https://sdn-warrior.org/posts/nsx-edge-redeploy/</link>
			<pubDate>Mon, 30 Dec 2024 15:00:46 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/nsx-edge-redeploy/</guid>
			<description><![CDATA[Quick tip: Redeploy an NSX Edge with the API]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>There are situations where you might need to redeploy an NSX Edge node. This could happen if an Edge VM becomes non-functional, or if it needs to be relocated within the datacenter, for instance to a different datastore or compute resource. You might also redeploy to move the node to another network. Of course, the specific reasons for redeployment depend on your environment.</p>
<p>It’s important to note that redeployment applies exclusively to existing NSX Edge nodes and can only be performed with an NSX Edge VM appliance.</p>
<h2 id="prerequisites">Prerequisites</h2>
<p>Before redeploying an NSX Edge node, keep the following in mind:</p>
<ul>
<li>
<p>While certain configurations of the NSX Edge transport node payload can be changed, <strong>do not modify</strong> the following settings on the existing NSX Edge node being replaced:</p>
<ul>
<li>Failure domain</li>
<li>Transport node connectivity</li>
<li>Physical NIC configuration</li>
<li>Logical routers</li>
<li>Load balancer allocations</li>
</ul>
</li>
<li>
<p>If the existing NSX Edge node is a physical server or was manually deployed via the vSphere Client, ensure that its connectivity to NSX Manager is down. If connectivity remains active, NSX will prevent the replacement of the existing node with a new one.</p>
</li>
<li>
<p><strong>Auto-deployed NSX Edge nodes</strong> retain hardware version 13. Starting with NSX 4.0.1.1, however, redeploying an NSX Edge VM automatically upgrades the new VM to a hardware version compatible with the ESXi host version. For a list of compatible VM hardware versions, refer to VMware KB article <a href="https://kb.vmware.com/s/article/2007240">2007240</a>.</p>
</li>
</ul>
<h2 id="procedure">Procedure</h2>
<p>To redeploy an NSX Edge node, follow these steps:</p>
<ol>
<li>
<p><strong>Check the NSX Edge node</strong></p>
<ul>
<li>Open an SSH session and connect to the NSX Edge console.</li>
<li>Verify the logical routers configured on the NSX Edge node by running the following command in the CLI console:
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">get logical-routers
</span></span></code></pre></div></li>
<li>Power off the NSX Edge node.</li>
</ul>
</li>
<li>
<p><strong>Verify disconnection from NSX Manager:</strong></p>
<ul>
<li>
<p>Use the API to confirm the NSX Edge node is disconnected:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">GET /api/v1/transport-nodes/&lt;edgenode&gt;/state
</span></span></code></pre></div><p>The <code>node_deployment_state</code> should display:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">&#34;node_deployment_state&#34;: {
</span></span><span class="line"><span class="cl">    &#34;state&#34;: &#34;MPA_Disconnected&#34;
</span></span><span class="line"><span class="cl">}
</span></span></code></pre></div><p>A state of <code>MPA_Disconnected</code> indicates that you can proceed with redeployment.</p>
</li>
<li>
<p><strong>Important:</strong> If the <code>node_deployment_state</code> is <code>Node Ready</code>, NSX Manager will block the redeployment and display error <strong>78006</strong>: <em>Manager connectivity to Edge node must be down</em>.</p>
</li>
<li>
<p>Alternatively, check the connectivity state from the Edge Transport Node page in the NSX UI. A disconnected NSX Edge node will show the system message:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">Configuration Error: Edge VM MPA Connectivity is down
</span></span></code></pre></div></li>
</ul>
</li>
<li>
<p><strong>For an auto-deployed NSX Edge node:</strong></p>
<ul>
<li>Use the following API command to retrieve the payload of the transport node:
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">GET https://&lt;NSX-Manager-IPaddress&gt;/api/v1/transport-nodes/&lt;edgenode&gt;
</span></span></code></pre></div></li>
<li>Save the output payload for later use. An example of the output is shown below.</li>
</ul>
</li>
</ol>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;node_id&#34;</span><span class="p">:</span> <span class="s2">&#34;607064c6-dd8d-4576-a2d9-2a73abff38aa&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;host_switch_spec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;host_switches&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">			<span class="p">{</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;host_switch_name&#34;</span><span class="p">:</span> <span class="s2">&#34;nsxHostSwitch&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;host_switch_id&#34;</span><span class="p">:</span> <span class="s2">&#34;c8e3cfaf-9837-4ee2-8a6f-9055927e6009&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;host_switch_type&#34;</span><span class="p">:</span> <span class="s2">&#34;NVDS&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;host_switch_mode&#34;</span><span class="p">:</span> <span class="s2">&#34;STANDARD&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;ecmp_mode&#34;</span><span class="p">:</span> <span class="s2">&#34;L3&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;host_switch_profile_ids&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">					<span class="p">{</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;key&#34;</span><span class="p">:</span> <span class="s2">&#34;UplinkHostSwitchProfile&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;value&#34;</span><span class="p">:</span> <span class="s2">&#34;1c653cda-9c95-414f-9b97-3d8f7cb192d6&#34;</span>
</span></span><span class="line"><span class="cl">					<span class="p">}</span>
</span></span><span class="line"><span class="cl">				<span class="p">],</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;pnics&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">					<span class="p">{</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;device_name&#34;</span><span class="p">:</span> <span class="s2">&#34;fp-eth0&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;uplink_name&#34;</span><span class="p">:</span> <span class="s2">&#34;Uplink1&#34;</span>
</span></span><span class="line"><span class="cl">					<span class="p">},</span>
</span></span><span class="line"><span class="cl">					<span class="p">{</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;device_name&#34;</span><span class="p">:</span> <span class="s2">&#34;fp-eth1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;uplink_name&#34;</span><span class="p">:</span> <span class="s2">&#34;Uplink2&#34;</span>
</span></span><span class="line"><span class="cl">					<span class="p">}</span>
</span></span><span class="line"><span class="cl">				<span class="p">],</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;is_migrate_pnics&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;ip_assignment_spec&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">					<span class="nt">&#34;ip_pool_id&#34;</span><span class="p">:</span> <span class="s2">&#34;00743a1f-a1a8-46b8-96a7-e0ebe58d7feb&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">					<span class="nt">&#34;resource_type&#34;</span><span class="p">:</span> <span class="s2">&#34;StaticIpPoolSpec&#34;</span>
</span></span><span class="line"><span class="cl">				<span class="p">},</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;cpu_config&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">				<span class="p">],</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;transport_zone_endpoints&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">					<span class="p">{</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;transport_zone_id&#34;</span><span class="p">:</span> <span class="s2">&#34;1b3a2f36-bfd1-443e-a0f6-4de01abc963e&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;transport_zone_profile_ids&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">							<span class="p">{</span>
</span></span><span class="line"><span class="cl">								<span class="nt">&#34;resource_type&#34;</span><span class="p">:</span> <span class="s2">&#34;BfdHealthMonitoringProfile&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">								<span class="nt">&#34;profile_id&#34;</span><span class="p">:</span> <span class="s2">&#34;52035bb3-ab02-4a08-9884-18631312e50a&#34;</span>
</span></span><span class="line"><span class="cl">							<span class="p">}</span>
</span></span><span class="line"><span class="cl">						<span class="p">]</span>
</span></span><span class="line"><span class="cl">					<span class="p">},</span>
</span></span><span class="line"><span class="cl">					<span class="p">{</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;transport_zone_id&#34;</span><span class="p">:</span> <span class="s2">&#34;a95c914d-748d-497c-94ab-10d4647daeba&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;transport_zone_profile_ids&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">							<span class="p">{</span>
</span></span><span class="line"><span class="cl">								<span class="nt">&#34;resource_type&#34;</span><span class="p">:</span> <span class="s2">&#34;BfdHealthMonitoringProfile&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">								<span class="nt">&#34;profile_id&#34;</span><span class="p">:</span> <span class="s2">&#34;52035bb3-ab02-4a08-9884-18631312e50a&#34;</span>
</span></span><span class="line"><span class="cl">							<span class="p">}</span>
</span></span><span class="line"><span class="cl">						<span class="p">]</span>
</span></span><span class="line"><span class="cl">					<span class="p">}</span>
</span></span><span class="line"><span class="cl">				<span class="p">],</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;pnics_uninstall_migration&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">				<span class="p">],</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;vmk_uninstall_migration&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">				<span class="p">],</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;not_ready&#34;</span><span class="p">:</span> <span class="kc">false</span>
</span></span><span class="line"><span class="cl">			<span class="p">}</span>
</span></span><span class="line"><span class="cl">		<span class="p">],</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;resource_type&#34;</span><span class="p">:</span> <span class="s2">&#34;StandardHostSwitchSpec&#34;</span>
</span></span><span class="line"><span class="cl">	<span class="p">},</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;maintenance_mode&#34;</span><span class="p">:</span> <span class="s2">&#34;DISABLED&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;node_deployment_info&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;deployment_type&#34;</span><span class="p">:</span> <span class="s2">&#34;VIRTUAL_MACHINE&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;deployment_config&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">			<span class="nt">&#34;vm_deployment_config&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;vc_id&#34;</span><span class="p">:</span> <span class="s2">&#34;0adeeac2-42dc-4d5a-a4c4-1890b1174a4e&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;compute_id&#34;</span><span class="p">:</span> <span class="s2">&#34;domain-c18&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;storage_id&#34;</span><span class="p">:</span> <span class="s2">&#34;datastore-30&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;management_network_id&#34;</span><span class="p">:</span> <span class="s2">&#34;dvportgroup-29&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;ipv4_assignment_enabled&#34;</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;ipv4_assignment_disabled&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;management_port_subnets&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">					<span class="p">{</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;ip_addresses&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">							<span class="s2">&#34;192.168.12.13&#34;</span>
</span></span><span class="line"><span class="cl">						<span class="p">],</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;prefix_length&#34;</span><span class="p">:</span> <span class="mi">24</span>
</span></span><span class="line"><span class="cl">					<span class="p">}</span>
</span></span><span class="line"><span class="cl">				<span class="p">],</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;default_gateway_addresses&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">					<span class="s2">&#34;192.168.12.1&#34;</span>
</span></span><span class="line"><span class="cl">				<span class="p">],</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;data_network_ids&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">					<span class="s2">&#34;dvportgroup-1001&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">					<span class="s2">&#34;dvportgroup-1002&#34;</span>
</span></span><span class="line"><span class="cl">				<span class="p">],</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;reservation_info&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">					<span class="nt">&#34;memory_reservation&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;reservation_percentage&#34;</span><span class="p">:</span> <span class="mi">100</span>
</span></span><span class="line"><span class="cl">					<span class="p">},</span>
</span></span><span class="line"><span class="cl">					<span class="nt">&#34;cpu_reservation&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;reservation_in_shares&#34;</span><span class="p">:</span> <span class="s2">&#34;HIGH_PRIORITY&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">						<span class="nt">&#34;reservation_in_mhz&#34;</span><span class="p">:</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl">					<span class="p">}</span>
</span></span><span class="line"><span class="cl">				<span class="p">},</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;resource_allocation&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">					<span class="nt">&#34;cpu_count&#34;</span><span class="p">:</span> <span class="mi">4</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">					<span class="nt">&#34;memory_allocation_in_mb&#34;</span><span class="p">:</span> <span class="mi">8192</span>
</span></span><span class="line"><span class="cl">				<span class="p">},</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;placement_type&#34;</span><span class="p">:</span> <span class="s2">&#34;VsphereDeploymentConfig&#34;</span>
</span></span><span class="line"><span class="cl">			<span class="p">},</span>
</span></span><span class="line"><span class="cl">			<span class="nt">&#34;form_factor&#34;</span><span class="p">:</span> <span class="s2">&#34;MEDIUM&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">			<span class="nt">&#34;node_user_settings&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">				<span class="nt">&#34;cli_username&#34;</span><span class="p">:</span> <span class="s2">&#34;admin&#34;</span>
</span></span><span class="line"><span class="cl">			<span class="p">}</span>
</span></span><span class="line"><span class="cl">		<span class="p">},</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;node_settings&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">			<span class="nt">&#34;hostname&#34;</span><span class="p">:</span> <span class="s2">&#34;edge01-nsx.lab.home&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">			<span class="nt">&#34;search_domains&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">				<span class="s2">&#34;lab.home&#34;</span>
</span></span><span class="line"><span class="cl">			<span class="p">],</span>
</span></span><span class="line"><span class="cl">			<span class="nt">&#34;ntp_servers&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">				<span class="s2">&#34;192.168.12.1&#34;</span>
</span></span><span class="line"><span class="cl">			<span class="p">],</span>
</span></span><span class="line"><span class="cl">			<span class="nt">&#34;dns_servers&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">				<span class="s2">&#34;192.168.11.2&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">				<span class="s2">&#34;192.168.100.254&#34;</span>
</span></span><span class="line"><span class="cl">			<span class="p">],</span>
</span></span><span class="line"><span class="cl">			<span class="nt">&#34;enable_ssh&#34;</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">			<span class="nt">&#34;allow_ssh_root_login&#34;</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">			<span class="nt">&#34;enable_upt_mode&#34;</span><span class="p">:</span> <span class="kc">false</span>
</span></span><span class="line"><span class="cl">		<span class="p">},</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;resource_type&#34;</span><span class="p">:</span> <span class="s2">&#34;EdgeNode&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;external_id&#34;</span><span class="p">:</span> <span class="s2">&#34;607064c6-dd8d-4576-a2d9-2a73abff38aa&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;ip_addresses&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">			<span class="s2">&#34;192.168.12.13&#34;</span>
</span></span><span class="line"><span class="cl">		<span class="p">],</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;607064c6-dd8d-4576-a2d9-2a73abff38aa&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;display_name&#34;</span><span class="p">:</span> <span class="s2">&#34;edge01-nsx.lab.home&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;tags&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">		<span class="p">],</span>
</span></span><span class="line"><span class="cl">		<span class="nt">&#34;_revision&#34;</span><span class="p">:</span> <span class="mi">2</span>
</span></span><span class="line"><span class="cl">	<span class="p">},</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;is_overridden&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;failure_domain_id&#34;</span><span class="p">:</span> <span class="s2">&#34;4fc1e3b0-1cd4-4339-86c8-f76baddbaafb&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;resource_type&#34;</span><span class="p">:</span> <span class="s2">&#34;TransportNode&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;607064c6-dd8d-4576-a2d9-2a73abff38aa&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;display_name&#34;</span><span class="p">:</span> <span class="s2">&#34;edge01-nsx.lab.home&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;tags&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">	<span class="p">],</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;_system_owned&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;_protection&#34;</span><span class="p">:</span> <span class="s2">&#34;NOT_PROTECTED&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;_create_time&#34;</span><span class="p">:</span> <span class="mi">1735544264171</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;_create_user&#34;</span><span class="p">:</span> <span class="s2">&#34;admin&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;_last_modified_time&#34;</span><span class="p">:</span> <span class="mi">1735546418835</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;_last_modified_user&#34;</span><span class="p">:</span> <span class="s2">&#34;admin&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">	<span class="nt">&#34;_revision&#34;</span><span class="p">:</span> <span class="mi">2</span> 
</span></span><span class="line"><span class="cl">  <span class="p">}</span>
</span></span></code></pre></div><ol start="4">
<li><strong>Run the redeploy API command:</strong></li>
</ol>
<p>Prepare the payload:</p>
<ul>
<li>Paste the payload retrieved earlier into the body of the redeploy API request.</li>
<li>Verify that the <code>deployment_config</code> section contains the details of the redeployment target:
<ul>
<li>Compute manager</li>
<li>Datastore</li>
<li>Network</li>
</ul>
</li>
<li>Ensure these values are consistent with those defined in the <code>node_settings</code> section.</li>
</ul>
<p>NSX Manager will use the information in the deployment_config section to redeploy the NSX Edge node to the specified location and resources.</p>
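<p>Since the whole procedure is a sequence of REST calls, steps 2 to 4 can also be scripted. The following playbook is only a sketch using the <code>ansible.builtin.uri</code> module, not a tested implementation: <code>nsx_manager</code>, <code>nsx_user</code>, <code>nsx_password</code>, and <code>edge_node_id</code> are placeholder variables chosen for illustration, and certificate validation is turned off on the assumption of a lab with self-signed certificates.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl">---
</span></span><span class="line"><span class="cl"># Sketch only: nsx_manager, nsx_user, nsx_password and edge_node_id are
</span></span><span class="line"><span class="cl"># placeholder variables, not names taken from NSX or from this post.
</span></span><span class="line"><span class="cl">- name: Redeploy an auto-deployed NSX Edge node (sketch)
</span></span><span class="line"><span class="cl">  hosts: localhost
</span></span><span class="line"><span class="cl">  gather_facts: false
</span></span><span class="line"><span class="cl">  vars:
</span></span><span class="line"><span class="cl">    nsx_api: &#34;https://{{ nsx_manager }}/api/v1/transport-nodes/{{ edge_node_id }}&#34;
</span></span><span class="line"><span class="cl">  tasks:
</span></span><span class="line"><span class="cl">    - name: Read the state of the Edge transport node
</span></span><span class="line"><span class="cl">      ansible.builtin.uri:
</span></span><span class="line"><span class="cl">        url: &#34;{{ nsx_api }}/state&#34;
</span></span><span class="line"><span class="cl">        method: GET
</span></span><span class="line"><span class="cl">        user: &#34;{{ nsx_user }}&#34;
</span></span><span class="line"><span class="cl">        password: &#34;{{ nsx_password }}&#34;
</span></span><span class="line"><span class="cl">        force_basic_auth: true
</span></span><span class="line"><span class="cl">        validate_certs: false  # lab assumption: self-signed certificate
</span></span><span class="line"><span class="cl">      register: edge_state
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Stop unless NSX Manager reports the node as disconnected
</span></span><span class="line"><span class="cl">      ansible.builtin.assert:
</span></span><span class="line"><span class="cl">        that:
</span></span><span class="line"><span class="cl">          - edge_state.json.node_deployment_state.state == &#39;MPA_Disconnected&#39;
</span></span><span class="line"><span class="cl">        fail_msg: &#34;Edge node is still connected, redeploy would fail with error 78006&#34;
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Retrieve the transport node payload
</span></span><span class="line"><span class="cl">      ansible.builtin.uri:
</span></span><span class="line"><span class="cl">        url: &#34;{{ nsx_api }}&#34;
</span></span><span class="line"><span class="cl">        method: GET
</span></span><span class="line"><span class="cl">        user: &#34;{{ nsx_user }}&#34;
</span></span><span class="line"><span class="cl">        password: &#34;{{ nsx_password }}&#34;
</span></span><span class="line"><span class="cl">        force_basic_auth: true
</span></span><span class="line"><span class="cl">        validate_certs: false
</span></span><span class="line"><span class="cl">      register: edge_payload
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Send the saved payload to the redeploy action
</span></span><span class="line"><span class="cl">      ansible.builtin.uri:
</span></span><span class="line"><span class="cl">        url: &#34;{{ nsx_api }}?action=redeploy&#34;
</span></span><span class="line"><span class="cl">        method: POST
</span></span><span class="line"><span class="cl">        user: &#34;{{ nsx_user }}&#34;
</span></span><span class="line"><span class="cl">        password: &#34;{{ nsx_password }}&#34;
</span></span><span class="line"><span class="cl">        force_basic_auth: true
</span></span><span class="line"><span class="cl">        validate_certs: false
</span></span><span class="line"><span class="cl">        body: &#34;{{ edge_payload.json }}&#34;
</span></span><span class="line"><span class="cl">        body_format: json
</span></span><span class="line"><span class="cl">      register: redeploy_result
</span></span></code></pre></div>
<p>Because the <code>assert</code> task runs before the payload is retrieved, the play stops early if the Edge node is still connected, instead of running into error 78006. The raw API call behind the last task is:</p>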
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">POST /api/v1/transport-nodes/&lt;transport-node-id&gt;?action=redeploy
</span></span></code></pre></div><h2 id="verifying-redeployment">Verifying Redeployment</h2>
<p>After the POST command executes successfully, the revision of the transport node (the <code>_revision</code> field in the payload) increments by one. This indicates that NSX Manager received the command.</p>
<ol>
<li>
<p><strong>Confirm the redeployment status in vCenter:</strong></p>
<ul>
<li>Log in to vCenter and navigate to the relevant cluster or host where the NSX Edge node is being redeployed.</li>
<li>Check the tasks and events logs for deployment activity related to the NSX Edge VM.</li>
</ul>
</li>
<li>
<p><strong>Validate the NSX Edge node redeployment:</strong></p>
<ul>
<li>Verify that the NSX Edge VM is being newly deployed in the specified location, as defined in the <code>deployment_config</code> section of the API payload.</li>
</ul>
</li>
</ol>
<p>Once the redeployment process is complete, ensure the NSX Edge node is functioning correctly in your NSX environment.</p>
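<p>The revision check mentioned at the start of this section can be scripted as well. This is again only a sketch with <code>ansible.builtin.uri</code>: the variable names are placeholders, and <code>revision_before</code> is assumed to hold the <code>_revision</code> value from the payload saved before the redeploy (2 in the example output above).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl">---
</span></span><span class="line"><span class="cl"># Sketch only: nsx_manager, nsx_user, nsx_password, edge_node_id and
</span></span><span class="line"><span class="cl"># revision_before are placeholder variables chosen for illustration.
</span></span><span class="line"><span class="cl">- name: Check that the redeploy request was registered (sketch)
</span></span><span class="line"><span class="cl">  hosts: localhost
</span></span><span class="line"><span class="cl">  gather_facts: false
</span></span><span class="line"><span class="cl">  tasks:
</span></span><span class="line"><span class="cl">    - name: Read the transport node again after the POST
</span></span><span class="line"><span class="cl">      ansible.builtin.uri:
</span></span><span class="line"><span class="cl">        url: &#34;https://{{ nsx_manager }}/api/v1/transport-nodes/{{ edge_node_id }}&#34;
</span></span><span class="line"><span class="cl">        method: GET
</span></span><span class="line"><span class="cl">        user: &#34;{{ nsx_user }}&#34;
</span></span><span class="line"><span class="cl">        password: &#34;{{ nsx_password }}&#34;
</span></span><span class="line"><span class="cl">        force_basic_auth: true
</span></span><span class="line"><span class="cl">        validate_certs: false  # lab assumption: self-signed certificate
</span></span><span class="line"><span class="cl">      register: edge_after
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Compare the revision with the one from the saved payload
</span></span><span class="line"><span class="cl">      ansible.builtin.assert:
</span></span><span class="line"><span class="cl">        that:
</span></span><span class="line"><span class="cl">          - edge_after.json[&#39;_revision&#39;] | int == (revision_before | int) + 1
</span></span><span class="line"><span class="cl">        fail_msg: &#34;Revision did not increment by one, check the redeploy request&#34;
</span></span></code></pre></div>
<p>A passing check only shows that the request was registered. The deployment itself still has to be confirmed in vCenter as described above.</p>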

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">After NSX Manager successfully deploys the new NSX Edge node, it <strong>powers on the Edge VM automatically</strong>.</div>
    </aside>
]]></content>
		</item>
		
		<item>
			<title>Ansible VLAN deployment with MikroTik</title>
			<link>https://sdn-warrior.org/posts/vlan-automation/</link>
			<pubDate>Tue, 24 Dec 2024 13:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/vlan-automation/</guid>
			<description><![CDATA[Ansible VLAN deployment with MikroTik]]></description>
			<content type="html"><![CDATA[<h2 id="yet-another-ansible-post">Yet Another Ansible Post</h2>
<p>As the year comes to a close, I can&rsquo;t help but reflect on the progress I&rsquo;ve made with Ansible in the past few weeks. Looking back, there&rsquo;s a lot to be satisfied with. After automating my IPAM system and the startup/shutdown process for my lab, I decided to tackle a long-standing annoyance: deploying VLANs for new lab environments.</p>
<h2 id="goals">Goals</h2>
<p>The main goal is to have a single file that describes all the required VLANs. This file should also allow me to delete or reuse VLANs as needed. Each VLAN must be deployed across three switches and configured as a tagged VLAN on specific ports.</p>
<p>Additionally, there are the peculiarities of MikroTik hardware to consider. When creating new VLANs on my Top-of-Rack (ToR) switch, MikroTik recommends disabling the L3 Hardware Offloading feature beforehand. In the past, I’ve encountered strange issues when this wasn’t done prior to creating new networks. Therefore, the automation should also handle disabling and re-enabling this feature as part of the process.</p>
<h2 id="future-goals">Future Goals</h2>
<p>In the future, I plan to expand this setup further, integrating it with Ansible Tower or ArgoCD to enable management through pipelines.</p>
<p>A pipeline is essentially an automated workflow that takes a defined input—such as a configuration file or a code repository—and processes it through several steps to achieve a desired outcome. For example, in this context, a pipeline could validate my VLAN configuration, deploy it to the target switches, and handle any post-deployment tasks automatically.</p>
<p>My locally hosted Gitea instance will serve as the <strong>Source of Truth</strong>, housing the configuration files and acting as the central repository for all changes. This ensures consistency, version control, and a clear audit trail for every modification.</p>
<h2 id="what-ive-done">What I&rsquo;ve Done</h2>
<p>For this implementation, I approached things a bit differently compared to my last two Ansible projects—after all, learning is part of the process!</p>
<h3 id="project-structure">Project Structure</h3>
<p>I structured the project as follows:</p>
<ul>
<li><strong>INI File</strong>: Stores the credentials for the three switches.</li>
<li><strong>ansible.cfg</strong>: Manages the inventory file and SSH settings.</li>
<li><strong>Playbook</strong>: A YAML file that runs without additional parameters.</li>
<li><strong>Directories</strong>:
<ul>
<li><code>group_vars/</code>: Contains global variables shared across all devices.
<ul>
<li><code>all.yml</code>: Includes all VLANs that should exist, along with descriptions.</li>
</ul>
</li>
<li><code>host_vars/</code>: Contains per-device variables.
<ul>
<li>Each switch has its own YAML file defining interfaces and bridges.</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>Here’s the project structure visualized:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">mikrotik/
</span></span><span class="line"><span class="cl">├── ansible.cfg          # Configuration file for project Ansible settings
</span></span><span class="line"><span class="cl">├── inventory.ini        # Inventory file with switch credentials
</span></span><span class="line"><span class="cl">├── vlan_esx.yml         # Main playbook
</span></span><span class="line"><span class="cl">├── group_vars/
</span></span><span class="line"><span class="cl">│   └── all.yml          # Global VLAN definitions and descriptions
</span></span><span class="line"><span class="cl">└── host_vars/
</span></span><span class="line"><span class="cl">    ├── 192.168.0.1.yml      # Variables for Switch 1 (interfaces, bridges)
</span></span><span class="line"><span class="cl">    ├── 192.168.0.5.yml      # Variables for Switch 2 (interfaces, bridges)
</span></span><span class="line"><span class="cl">    └── 192.168.0.7.yml      # Variables for Switch 3 (interfaces, bridges)
</span></span></code></pre></div><h3 id="advantages-of-this-structure">Advantages of This Structure</h3>
<p>One major advantage of this structure is its scalability. Adding another MikroTik switch is straightforward: I only need to create a new file in <code>host_vars/</code> for the switch and update the <code>inventory.ini</code> file with its credentials.</p>
<p>Additionally, since the playbook runs without requiring extra parameters, it simplifies the GitOps approach I plan to implement later. This means the entire process becomes more streamlined and easily automatable through pipelines, reducing complexity and potential for errors.</p>
<h2 id="project-files">Project Files</h2>
<p>Here are the files I created as part of this project:</p>
<h3 id="ansiblecfg">ansible.cfg</h3>
<p>This file configures Ansible with the necessary inventory and SSH settings.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="cl"><span class="k">[defaults]</span>
</span></span><span class="line"><span class="cl"><span class="na">host_key_checking</span> <span class="o">=</span> <span class="s">False</span>
</span></span><span class="line"><span class="cl"><span class="na">transport</span> <span class="o">=</span> <span class="s">ssh</span>
</span></span><span class="line"><span class="cl"><span class="na">inventory</span> <span class="o">=</span> <span class="s">inventory.ini</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">[ssh_connection]</span>
</span></span><span class="line"><span class="cl"><span class="na">ssh_type</span> <span class="o">=</span> <span class="s">libssh</span>
</span></span><span class="line"><span class="cl"><span class="na">timeout</span> <span class="o">=</span> <span class="s">60</span>
</span></span></code></pre></div><h3 id="inventoryini">inventory.ini</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="cl"><span class="k">[mikrotik]</span>
</span></span><span class="line"><span class="cl"><span class="na">192.168.0.1 ansible_user</span><span class="o">=</span><span class="s">admin ansible_password=&#34;xxx&#34; ansible_connection=network_cli ansible_network_os=community.network.routeros</span>
</span></span><span class="line"><span class="cl"><span class="na">192.168.0.5 ansible_user</span><span class="o">=</span><span class="s">admin ansible_password=&#34;xxx&#34; ansible_connection=network_cli ansible_network_os=community.network.routeros </span>
</span></span><span class="line"><span class="cl"><span class="na">192.168.0.7 ansible_user</span><span class="o">=</span><span class="s">admin ansible_password=&#34;xxx&#34; ansible_connection=network_cli ansible_network_os=community.network.routeros</span>
</span></span></code></pre></div><h3 id="vlan_esxyml">vlan_esx.yml</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl">- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Manage VLANs with delete option</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">hosts</span><span class="p">:</span><span class="w"> </span><span class="l">mikrotik</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">gather_facts</span><span class="p">:</span><span class="w"> </span><span class="kc">no</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">collections</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">community.network</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">tasks</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Disable L3 HW Offloading</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">community.network.routeros_command</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">commands</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="s2">&#34;/interface ethernet switch set [find name=switch1] l3-hw-offloading=no&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">when</span><span class="p">:</span><span class="w"> </span><span class="l">ansible_host == &#39;192.168.0.1&#39;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">register</span><span class="p">:</span><span class="w"> </span><span class="l">hw_offload_result</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Delete VLANs marked for deletion</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">community.network.routeros_command</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">commands</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="s2">&#34;/interface bridge vlan remove [find vlan-ids={{ item.id }}]&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">with_items</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ vlans }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">when</span><span class="p">:</span><span class="w"> </span><span class="l">item.delete | bool</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Configure VLANs on the bridge</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">community.network.routeros_command</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">commands</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="s2">&#34;/interface bridge vlan remove [find vlan-ids={{ item.id }}]&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="s2">&#34;/interface bridge vlan add bridge={{ bridge }} vlan-ids={{ item.id }} tagged={{ interfaces | join(&#39;,&#39;) }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">with_items</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ vlans }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">when</span><span class="p">:</span><span class="w"> </span><span class="l">not item.delete | bool</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Set description for each VLAN</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">community.network.routeros_command</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">commands</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="s2">&#34;/interface bridge vlan comment [find vlan-ids={{ item.id }}] comment=\&#34;{{ item.description }}\&#34;&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">with_items</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ vlans }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">when</span><span class="p">:</span><span class="w"> </span><span class="l">not item.delete | bool</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Enable L3 HW Offloading</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">community.network.routeros_command</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">commands</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="s2">&#34;/interface ethernet switch set [find name=switch1] l3-hw-offloading=yes&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">when</span><span class="p">:</span><span class="w"> </span><span class="l">ansible_host == &#39;192.168.0.1&#39;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">register</span><span class="p">:</span><span class="w"> </span><span class="l">hw_offload_result</span><span class="w">
</span></span></span></code></pre></div><h3 id="group_varsallyml">group_vars/all.yml</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">vlans</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">4</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;vMotion&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">12</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;ESXi MGMT&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">14</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;NSXB Host Tep&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">15</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;NSXB Edge Tep&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">20</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;K3s&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">31</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;NSXB Uplink1&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">41</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;NSXB Uplink2&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">50</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;RTEP NSX Federation&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">69</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;vSAN&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">200</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;VCF VM MGMT&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">201</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;VCF MGMT&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">202</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;VCF vSAN&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">203</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;VCF vSAN&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">204</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;VCF HostTEP&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">205</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;VCF EdgeTEP&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">206</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34; &#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">207</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34; &#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">208</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34; &#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">209</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34; &#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">211</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34; &#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="m">212</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34; &#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">delete</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span></code></pre></div><h3 id="host_varsswitch1yml">host_vars/switch1.yml</h3>
<p>Each file defines the specific interfaces and the bridge for one switch. The file is named after the inventory host, for example <code>192.168.0.5.yml</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">interfaces</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">10_bonding_SWA02</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">00_bonding_CoreRouter</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">01_ether1_ESX01_1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">01_ether2_ESX01_2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">02_qsfpplus1-1_ESX02_1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">03_qsfpplus1-2_ESX03_1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">04_qsfpplus1-3_ESX04_1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">05_qsfpplus1-4_ESX05_1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">07_ether3_ESX07_1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">07_ether4_ESX07_2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">08_ether5_ESX08_1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">08_ether6_ESX08_2 </span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">09_ether7_ESX09_1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="l">09_ether8_ESX09_2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">bridge</span><span class="p">:</span><span class="w"> </span><span class="l">bridge</span><span class="w">
</span></span></span></code></pre></div><h2 id="what-does-the-playbook-do">What Does the Playbook Do?</h2>
<p>This playbook is designed to manage VLANs on MikroTik switches, including the ability to delete or configure VLANs. It uses the <code>community.network</code> collection and skips fact-gathering since it only executes specific commands on the devices.</p>
<h3 id="step-by-step-breakdown">Step-by-Step Breakdown</h3>
<ol>
<li>
<p><strong>Disable L3 Hardware Offloading</strong><br>
The playbook begins by disabling the L3 Hardware Offloading feature on the switch named <code>ToR Switch</code>. This step is necessary because MikroTik recommends turning off this feature before making VLAN configuration changes. The command is executed only if the host IP is <code>192.168.0.1</code>. The result is stored in the variable <code>hw_offload_result</code>.</p>
</li>
<li>
<p><strong>Delete VLANs Marked for Deletion</strong><br>
VLANs that are marked for deletion in the <code>vlans</code> variable (<code>delete: true</code>) are removed. The playbook iterates through the list of VLANs and executes the removal command for each entry flagged this way.</p>
</li>
<li>
<p><strong>Configure VLANs</strong><br>
For VLANs that are not marked for deletion, the playbook configures them as follows:</p>
<ul>
<li>Removes any existing VLAN with the same ID to avoid conflicts.</li>
<li>Adds the VLAN to the specified bridge with all interfaces from the <code>interfaces</code> variable as <code>tagged</code> ports. This ensures a clean and consistent configuration.</li>
</ul>
</li>
<li>
<p><strong>Set VLAN Descriptions</strong><br>
The playbook adds a description to each VLAN that is not marked for deletion. It uses the RouterOS <code>comment</code> command and takes the text from the <code>description</code> field of each VLAN.</p>
</li>
<li>
<p><strong>Enable L3 Hardware Offloading</strong><br>
Finally, the playbook re-enables the L3 Hardware Offloading feature on <code>ToR Switch</code>, but only if the host IP is <code>192.168.0.1</code>. The result of this step is also stored in the <code>hw_offload_result</code> variable.</p>
</li>
</ol>
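<p>To make the order of operations easier to follow, the VLAN steps above can be reproduced offline. This is a minimal sketch in Python, not part of the project: the function name <code>render_commands</code> and the sample interface names are assumptions, and the two L3 offloading toggles are left out. It only builds the command strings the playbook templates would produce and does not talk to a switch.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">def render_commands(vlans, bridge, interfaces):
    """Build the RouterOS commands in the order the playbook sends them."""
    tagged = ",".join(interfaces)
    commands = []
    # Step 2: remove every VLAN flagged with delete: true
    for v in vlans:
        if v["delete"]:
            vid = v["id"]
            commands.append(f"/interface bridge vlan remove [find vlan-ids={vid}]")
    # Step 3: remove and re-add every VLAN that should exist
    for v in vlans:
        if not v["delete"]:
            vid = v["id"]
            commands.append(f"/interface bridge vlan remove [find vlan-ids={vid}]")
            commands.append(
                f"/interface bridge vlan add bridge={bridge} vlan-ids={vid} tagged={tagged}"
            )
    # Step 4: set the comment on every VLAN that should exist
    for v in vlans:
        if not v["delete"]:
            vid = v["id"]
            desc = v["description"]
            commands.append(
                f'/interface bridge vlan comment [find vlan-ids={vid}] comment="{desc}"'
            )
    return commands


# Two entries in the same shape as group_vars/all.yml
sample_vlans = [
    {"id": 4, "description": "vMotion", "delete": False},
    {"id": 206, "description": " ", "delete": True},
]
for command in render_commands(sample_vlans, "bridge", ["ether1", "ether2"]):
    print(command)
</code></pre></div>
<p>For the sample data this prints four commands: the removal of VLAN 206, then the remove, add and comment commands for VLAN 4.</p>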
<h2 id="summary">Summary</h2>
<p>This playbook automates the entire VLAN management process:</p>
<ul>
<li>Disables the L3 offloading feature when required.</li>
<li>Deletes VLANs marked for removal.</li>
<li>Configures new VLANs, ensuring no conflicts.</li>
<li>Sets descriptions for the VLANs.</li>
<li>Re-enables the L3 offloading feature after the configuration.</li>
</ul>
<p>The structure ensures reliability and consistency, handling edge cases such as existing VLANs and hardware offloading quirks automatically.</p>
<h2 id="whats-left-to-do">What&rsquo;s Left to Do</h2>
<p>There are several improvements I plan to make to this playbook in the future:</p>
<ol>
<li>
<p><strong>Clean Up Variables</strong><br>
Currently, there are some unused variables in the playbook that I need to clean up to keep the codebase tidy and maintainable.</p>
</li>
<li>
<p><strong>Enhanced Logic for VLAN Checks</strong><br>
I want to extend the logic to verify if a VLAN already matches the desired target configuration. This would prevent unnecessary deletion and re-creation of VLANs, reducing downtime and ensuring a smoother operation.</p>
</li>
<li>
<p><strong>Improved Error Handling</strong><br>
Better error handling is a priority to ensure the playbook gracefully recovers from unexpected issues, such as failed commands or unreachable devices.</p>
</li>
<li>
<p><strong>Pipeline Integration</strong><br>
The ultimate goal is to integrate the playbook into a pipeline, enabling automated execution through tools like Ansible Tower or ArgoCD. This would streamline the entire process and align it with a GitOps approach.</p>
</li>
<li>
<p><strong>Distributed Switch Integration</strong><br>
It’s also conceivable to extend the functionality by adding new VLANs directly to the distributed switch in my vSphere environment. However, this would be handled in a separate playbook to maintain modularity. The pipeline would then orchestrate both playbooks to ensure a seamless configuration process.</p>
</li>
</ol>
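<p>The VLAN check from the second item could look roughly like the comparison below. It is only a sketch in Python under stated assumptions: the function name <code>plan_vlan_changes</code> and both data shapes are hypothetical, and how the current state is read from the switch is not covered here.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">def plan_vlan_changes(current, desired):
    """Compare current and desired state and return only the actions needed.

    current: dict mapping a VLAN id to the set of tagged interface names
    desired: list of dicts with the keys "id", "delete" and "tagged"
    """
    actions = []
    for vlan in desired:
        vid = vlan["id"]
        exists = vid in current
        if vlan["delete"]:
            # Only remove VLANs that are actually present on the switch
            if exists:
                actions.append(("remove", vid))
        elif not exists:
            actions.append(("add", vid))
        elif current[vid] != set(vlan["tagged"]):
            actions.append(("update", vid))
        # Otherwise the VLAN already matches and is left untouched
    return actions


# Hypothetical state: VLAN 4 is correct, 12 lacks a port, 20 is missing
current_state = {4: {"ether1", "ether2"}, 12: {"ether1"}, 206: {"ether1"}}
desired_state = [
    {"id": 4, "delete": False, "tagged": ["ether1", "ether2"]},
    {"id": 12, "delete": False, "tagged": ["ether1", "ether2"]},
    {"id": 20, "delete": False, "tagged": ["ether1"]},
    {"id": 206, "delete": True, "tagged": []},
    {"id": 207, "delete": True, "tagged": []},
]
print(plan_vlan_changes(current_state, desired_state))
</code></pre></div>
<p>With this input only VLANs 12, 20 and 206 produce an action. VLAN 4 already matches and VLAN 207 does not exist, so neither is touched, which is exactly the unnecessary delete and re-create cycle the improvement is meant to avoid.</p>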
<p>By addressing these points, the project will become more robust, scalable, and aligned with modern automation practices.</p>
]]></content>
		</item>
		
		<item>
			<title>IPAM Automation with NetBox, Ansible, and Microsoft Windows DNS Server</title>
			<link>https://sdn-warrior.org/posts/ipam-automation/</link>
			<pubDate>Fri, 20 Dec 2024 02:00:04 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/ipam-automation/</guid>
			<description><![CDATA[IPAM Automation with NetBox and Ansible]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>Managing IP addresses and DNS records manually can be a daunting task, especially in dynamic IT environments. This blog post demonstrates how to leverage NetBox, Ansible, and Microsoft Windows DNS Server to automate IP Address Management (IPAM) and DNS record updates, making your infrastructure more efficient and reliable.</p>
<h2 id="why-automate-ipam-and-dns">Why Automate IPAM and DNS?</h2>
<ul>
<li>Consistency: Automation minimizes human errors and ensures uniformity.</li>
<li>Efficiency: Automating repetitive tasks saves time and allows teams to focus on strategic activities.</li>
<li>Scalability: As networks grow, automated solutions adapt more easily than manual processes.</li>
</ul>
<h2 id="my-goal">My goal</h2>
<ul>
<li>A free IP address is dynamically fetched from a defined subnet in NetBox.</li>
<li>The IP address is immediately assigned to the specified FQDN in NetBox.</li>
<li>A corresponding host (A) record is created in my Windows DNS Server.</li>
</ul>
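<p>The first goal, picking a free address from a subnet, is done by NetBox in the real workflow. Purely to illustrate the idea, here is a minimal local sketch in Python using the standard <code>ipaddress</code> module. The function name <code>first_free_ip</code>, the prefix and the used addresses are all invented for this example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import ipaddress


def first_free_ip(prefix, used):
    """Return the first host address in prefix that is not in used, or None."""
    taken = {ipaddress.ip_address(address) for address in used}
    # hosts() skips the network and broadcast addresses of an IPv4 prefix
    for host in ipaddress.ip_network(prefix).hosts():
        if host not in taken:
            return str(host)
    return None


# Hypothetical subnet with two addresses already assigned
print(first_free_ip("192.168.10.0/29", ["192.168.10.1", "192.168.10.2"]))
</code></pre></div>
<p>This prints <code>192.168.10.3</code>. When every host address is taken the function returns <code>None</code>, which is the case the automation has to handle before it creates a DNS record.</p>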
<h2 id="prerequisites">Prerequisites</h2>
<p>Before diving into the implementation, ensure the following:</p>
<ul>
<li>A functional NetBox instance configured with appropriate IPAM data.</li>
<li>A Microsoft Windows DNS Server with administrative access.</li>
<li>Ansible installed and configured on a control node.</li>
<li>API access credentials for NetBox.</li>
<li>The pywinrm Python module installed on the control node.</li>
<li>PowerShell Remoting enabled on the Windows DNS Server.</li>
</ul>
<h2 id="ansible-project">Ansible Project</h2>
<p>For this automation project, I structured my workflow into multiple steps to keep it organized and modular. I use an ansible.cfg file to integrate and manage my inventory. At the core of the setup is a master playbook, which orchestrates the entire automation process.</p>
<p>To simplify and separate concerns, I divided the tasks into two sub-playbooks:</p>
<ul>
<li>NetBox playbook: Handles all interactions with NetBox, such as fetching available IPs or updating DNS-related metadata.</li>
<li>DNS playbook: Focuses on managing DNS records on my Microsoft Windows DNS Server.</li>
</ul>
<p>This approach not only makes the automation workflow easier to manage but also allows me to test and modify individual components independently while maintaining a clear overview of the entire process through the master playbook.</p>
<h2 id="getting-started">Getting Started</h2>
<p>To begin, I will list the files and their roles in this automation project. While these files are currently stored in my local Gitea instance, I’m considering creating a public Git repository for future projects to make them more accessible and easier to share.</p>
<h3 id="inventoryyml">inventory.yml</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">all</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">hosts</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">dnsserver.lab.home</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ansible_host</span><span class="p">:</span><span class="w"> </span><span class="l">dc.lab.home </span><span class="w"> </span><span class="c"># IP address or hostname of the Windows DNS server</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ansible_user</span><span class="p">:</span><span class="w"> </span><span class="l">administrator </span><span class="w"> </span><span class="c"># Username </span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ansible_password</span><span class="p">:</span><span class="w"> </span><span class="l">xxx </span><span class="w"> </span><span class="c"># Password</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ansible_connection</span><span class="p">:</span><span class="w"> </span><span class="l">winrm </span><span class="w"> </span><span class="c"># connection</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ansible_winrm_transport</span><span class="p">:</span><span class="w"> </span><span class="l">basic </span><span class="w"> </span><span class="c"># auth</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ansible_winrm_server_cert_validation</span><span class="p">:</span><span class="w"> </span><span class="l">ignore</span><span class="w"> </span><span class="c"># don&#39;t check the certificate</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">ansible_port</span><span class="p">:</span><span class="w"> </span><span class="m">5986</span><span class="w"> </span><span class="c">#winrm https port</span><span class="w">
</span></span></span></code></pre></div><h3 id="ansiblecfg">ansible.cfg</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="cl"><span class="k">[defaults]</span>
</span></span><span class="line"><span class="cl"><span class="na">inventory</span> <span class="o">=</span> <span class="s">inventory.yml</span>
</span></span></code></pre></div><h3 id="register_ipyml">register_ip.yml</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl">- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Validate input variables</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">fail</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">msg</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;You must provide &#39;netbox_token&#39;, &#39;prefix&#39;, and &#39;dns_name&#39; as extra-vars.&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">when</span><span class="p">:</span><span class="w"> </span><span class="l">netbox_token == &#34;&#34; or prefix == &#34;&#34; or dns_name == &#34;&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl">- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Get the prefix ID from NetBox</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">uri</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">url</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ netbox_url }}/api/ipam/prefixes/?prefix={{ prefix }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">method</span><span class="p">:</span><span class="w"> </span><span class="l">GET</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">headers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">Authorization</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;Token {{ netbox_token }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">Accept</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;application/json&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">return_content</span><span class="p">:</span><span class="w"> </span><span class="kc">yes</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">register</span><span class="p">:</span><span class="w"> </span><span class="l">prefix_data</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl">- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Fail if the prefix does not exist</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">fail</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">msg</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;Prefix {{ prefix }} does not exist in NetBox.&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">when</span><span class="p">:</span><span class="w"> </span><span class="l">prefix_data.json.results | length == 0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl">- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Get available IPs in the prefix</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">uri</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">url</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ netbox_url }}/api/ipam/prefixes/{{ prefix_data.json.results[0].id }}/available-ips/&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">method</span><span class="p">:</span><span class="w"> </span><span class="l">GET</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">headers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">Authorization</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;Token {{ netbox_token }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">Accept</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;application/json&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">return_content</span><span class="p">:</span><span class="w"> </span><span class="kc">yes</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">register</span><span class="p">:</span><span class="w"> </span><span class="l">available_ips</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl">- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Fail if no available IPs are found</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">fail</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">msg</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;No available IPs found in prefix {{ prefix }}.&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">when</span><span class="p">:</span><span class="w"> </span><span class="l">available_ips.json | length == 0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl">- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Assign the first available IP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">uri</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">url</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ netbox_url }}/api/ipam/ip-addresses/&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">method</span><span class="p">:</span><span class="w"> </span><span class="l">POST</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">headers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">Authorization</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;Token {{ netbox_token }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">Accept</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;application/json&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">Content-Type</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;application/json&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">body</span><span class="p">:</span><span class="w"> </span><span class="p">&gt;</span><span class="sd">
</span></span></span><span class="line"><span class="cl"><span class="sd">      {
</span></span></span><span class="line"><span class="cl"><span class="sd">        &#34;address&#34;: &#34;{{ available_ips.json[0].address }}&#34;,
</span></span></span><span class="line"><span class="cl"><span class="sd">        &#34;status&#34;: &#34;active&#34;,
</span></span></span><span class="line"><span class="cl"><span class="sd">        &#34;description&#34;: &#34;Created by Ansible&#34;,
</span></span></span><span class="line"><span class="cl"><span class="sd">        &#34;dns_name&#34;: &#34;{{ dns_name }}&#34;
</span></span></span><span class="line"><span class="cl"><span class="sd">      }</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">body_format</span><span class="p">:</span><span class="w"> </span><span class="l">json</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">status_code</span><span class="p">:</span><span class="w"> </span><span class="m">201</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">return_content</span><span class="p">:</span><span class="w"> </span><span class="kc">yes</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">register</span><span class="p">:</span><span class="w"> </span><span class="l">ip_assignment</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl">- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Extract host and zone from DNS name</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">set_fact</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">dns_host</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ dns_name.split(&#39;.&#39;)[0] }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">dns_zone</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ dns_name.split(&#39;.&#39;, 1)[1] }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">assigned_ip</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ ip_assignment.json.address.split(&#39;/&#39;)[0] }}&#34;</span><span class="w">
</span></span></span></code></pre></div><h3 id="add_dns_recordyml">add_dns_record.yml</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl">- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Add DNS A Record</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">win_shell</span><span class="p">:</span><span class="w"> </span><span class="p">|</span><span class="sd">
</span></span></span><span class="line"><span class="cl"><span class="sd">    Add-DnsServerResourceRecordA -Name &#34;{{ zdns_host }}&#34; -ZoneName &#34;{{ zdns_zone }}&#34; -IPv4Address &#34;{{ zassigned_ip }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">args</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">executable</span><span class="p">:</span><span class="w"> </span><span class="l">powershell</span><span class="w">
</span></span></span></code></pre></div><h3 id="mp_dnsyml-my-masterplaybook">mp_dns.yml (my masterplaybook)</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl">- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Register IP in NetBox</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">hosts</span><span class="p">:</span><span class="w"> </span><span class="l">localhost</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">gather_facts</span><span class="p">:</span><span class="w"> </span><span class="kc">no</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">vars</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">prefix</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ prefix }}&#34;</span><span class="w">  </span><span class="c"># passed in via --extra-vars</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">dns_name</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ dns_name }}&#34;</span><span class="w">  </span><span class="c"># passed in via --extra-vars</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">netbox_url</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;http://netbox.lab.home&#34;</span><span class="w">  </span><span class="c"># NetBox URL</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">netbox_token</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;xxx&#34;</span><span class="w">  </span><span class="c"># NetBox API token</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">tasks</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Run NetBox IP Registration Playbook</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">include_tasks</span><span class="p">:</span><span class="w"> </span><span class="l">register_ip.yml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">vars</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">prefix</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ prefix }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">dns_name</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ dns_name }}&#34;</span><span class="w">
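</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Store the results as facts for the DNS play</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">set_fact</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">sip</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ assigned_ip }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">sdns</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ dns_host }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">szone</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ dns_zone }}&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">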
</span></span></span><span class="line"><span class="cl">- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Add DNS A Record</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">hosts</span><span class="p">:</span><span class="w"> </span><span class="l">dnsserver.lab.home</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">gather_facts</span><span class="p">:</span><span class="w"> </span><span class="kc">no</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">vars</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">ansible_winrm_server_cert_validation</span><span class="p">:</span><span class="w"> </span><span class="l">ignore</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">zassigned_ip</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ hostvars[&#39;localhost&#39;][&#39;sip&#39;] }}&#34;</span><span class="w"> </span><span class="c"># variables</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">zdns_host</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ hostvars[&#39;localhost&#39;][&#39;sdns&#39;] }}&#34;</span><span class="w">   </span><span class="c"># variables</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">zdns_zone</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ hostvars[&#39;localhost&#39;][&#39;szone&#39;] }}&#34;</span><span class="w">  </span><span class="c"># variables</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">tasks</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Include DNS Record Playbook</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">include_tasks</span><span class="p">:</span><span class="w"> </span><span class="l">add_dns_record.yml</span><span class="w">
</span></span></span></code></pre></div><h2 id="how-the-playbooks-work">How the Playbooks Work</h2>
<p>The process is coordinated by a master playbook (mp_dns.yml) and relies on sub-playbooks for discrete tasks.</p>
<h3 id="master-playbook-mp_dnsyml">Master Playbook (mp_dns.yml)</h3>
<p>The master playbook serves as the central control file. It performs the following steps:</p>
<ul>
<li>
<p>Registers an IP Address in NetBox:
The first play runs on localhost and invokes the register_ip.yml sub-playbook to allocate an available IP address within a specified prefix and associate it with the given DNS name in NetBox.</p>
</li>
<li>
<p>Sets Facts:
After obtaining the IP address and DNS details from NetBox, it uses set_fact to store these values in variables (sip, sdns, szone) on localhost for use in the second play.</p>
</li>
<li>
<p>Adds a DNS A Record:
The second play connects to the DNS server and calls the add_dns_record.yml sub-playbook to create a DNS A record using the information retrieved from NetBox.</p>
</li>
</ul>
<h3 id="sub-playbook-register_ipyml">Sub-Playbook: register_ip.yml</h3>
<p>This playbook interacts with NetBox&rsquo;s API to:</p>
<ul>
<li>Validate input variables like the NetBox token, prefix, and DNS name.</li>
<li>Retrieve the prefix and find available IPs.</li>
<li>Assign the first available IP to the provided DNS name and register it in NetBox.</li>
</ul>
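<p>The validation step in register_ip.yml compares each variable with an empty string, which raises an undefined-variable error instead of the intended message when a variable is missing altogether. One possible variant, sketched here with ansible.builtin.assert and the default filter, treats missing and empty values alike. It is an illustration under that assumption, not the version used in this project.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">- name: Validate input variables
  ansible.builtin.assert:
    that:
      # default('') turns an undefined variable into an empty string,
      # so both "missing" and "empty" fail the length check
      - netbox_token | default('') | length &gt; 0
      - prefix | default('') | length &gt; 0
      - dns_name | default('') | length &gt; 0
    fail_msg: "You must provide 'netbox_token', 'prefix', and 'dns_name'."
</code></pre></div>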
<p>The playbook sends a POST request to the NetBox API to assign an available IP address to the provided DNS name. NetBox answers with the newly created IP address object in JSON format.</p>
<p>The values needed for the DNS record are then set as facts:</p>
<ul>
<li>dns_host and dns_zone are derived by splitting the FQDN.</li>
<li>assigned_ip captures the raw IP address, omitting the CIDR notation.</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="w">  </span><span class="nt">dns_host</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ dns_name.split(&#39;.&#39;)[0] }}&#34;</span><span class="w">  </span><span class="c"># Extracts the hostname (e.g., &#34;myhost&#34; from &#34;myhost.lab.local&#34;)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">dns_zone</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ dns_name.split(&#39;.&#39;, 1)[1] }}&#34;</span><span class="w">  </span><span class="c"># Extracts the zone (e.g., &#34;lab.local&#34;)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">assigned_ip</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;{{ ip_assignment.json.address.split(&#39;/&#39;)[0] }}&#34;</span><span class="w">  </span><span class="c"># Removes the subnet mask (e.g., &#34;192.168.1.10/24&#34; to &#34;192.168.1.10&#34;)</span><span class="w">
</span></span></span></code></pre></div><p>This parsing ensures the required details are extracted for creating the DNS record in subsequent tasks, linking NetBox&rsquo;s IP allocation to the DNS configuration seamlessly.</p>
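<p>Fetching the list of free addresses with GET and then creating the first one with a separate POST leaves a small window in which another client could claim the same address. As far as I know, NetBox also accepts a POST on the available-ips endpoint of a prefix and then allocates the next free address itself. The task below is a sketch of that idea and could replace the GET and POST pair. It reuses the variable names from register_ip.yml, and the behaviour should be verified against your NetBox version.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">- name: Allocate the next free IP in a single request
  uri:
    url: "{{ netbox_url }}/api/ipam/prefixes/{{ prefix_data.json.results[0].id }}/available-ips/"
    method: POST
    headers:
      Authorization: "Token {{ netbox_token }}"
      Accept: "application/json"
    body_format: json        # the dictionary below is serialized to JSON
    body:
      status: "active"
      description: "Created by Ansible"
      dns_name: "{{ dns_name }}"
    status_code: 201         # NetBox answers 201 Created on success
    return_content: yes
  register: ip_assignment    # same name as before, so the set_fact task still works
</code></pre></div>
<p>Because the result is registered under the same name, ip_assignment.json.address can still be split exactly as shown above.</p>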
<h3 id="sub-playbook-add_dns_recordyml">Sub-Playbook: add_dns_record.yml</h3>
<p>This playbook uses PowerShell (win_shell) to execute the Add-DnsServerResourceRecordA cmdlet on the Windows DNS server. It creates a DNS A record with the assigned IP, host, and zone.</p>
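<p>Calling the cmdlet through win_shell is not idempotent: a second run with the same values reports an error because the record already exists. A possible alternative, assuming the community.windows collection is installed on the control node, is the win_dns_record module. The sketch below uses the same three variables as add_dns_record.yml and is not the method used in this post.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">---
- name: Add DNS A record (idempotent variant)
  community.windows.win_dns_record:
    name: "{{ zdns_host }}"      # host part, e.g. "hello-world"
    zone: "{{ zdns_zone }}"      # zone, e.g. "lab.home"
    type: "A"
    value: "{{ zassigned_ip }}"  # IP address without the prefix length
    state: present
</code></pre></div>
<p>With state: present, the module only reports a change when the record is missing or differs, which makes repeated runs safe.</p>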
<h3 id="why-use-host_vars">Why Use hostvars?</h3>
<p>hostvars is a built-in Ansible variable that provides access to variables from other hosts in the inventory. This is particularly useful when you need to share or reference facts or variables gathered from one host on another host.
The NetBox-related tasks (e.g., registering IP addresses and extracting DNS details) are performed on localhost since they interact with external APIs and don’t require remote server execution.
Variables like sip, sdns, and szone are set as facts on localhost during the first phase of the playbook execution.
The <em><strong>hostvars[&#39;localhost&#39;]</strong></em> construct is used to retrieve these facts and make them available to the subsequent tasks running on the DNS server (dnsserver.lab.home).</p>
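<p>The mechanism can be tried in isolation with a tiny two-play playbook. This is only a sketch: the IP address is a made-up example value, and the debug task runs on the control node, so it does not even need a working WinRM connection.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">---
- name: First play sets a fact on localhost
  hosts: localhost
  gather_facts: no
  tasks:
    - name: Store an example value
      set_fact:
        sip: "192.168.2.10"   # example value for illustration only

- name: Second play reads the fact from another host
  hosts: dnsserver.lab.home
  gather_facts: no
  tasks:
    - name: Show the value that was set on localhost
      debug:
        msg: "IP from localhost: {{ hostvars['localhost']['sip'] }}"
</code></pre></div>
<p>Referencing sip directly in the second play would fail, because the fact belongs to localhost and not to dnsserver.lab.home.</p>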
<h3 id="variable-assignments">Variable Assignments:</h3>
<ul>
<li>zassigned_ip: This retrieves the IP address (sip) assigned to the host from the NetBox interaction on localhost.</li>
<li>zdns_host: This extracts the host portion of the DNS name (sdns) derived from the FQDN split.</li>
<li>zdns_zone: This fetches the DNS zone (szone), also derived from the FQDN split.</li>
</ul>
<p>This approach ensures that:</p>
<ul>
<li>Data derived or computed in one phase (NetBox-related tasks) is seamlessly passed to the next phase (DNS-related tasks).</li>
<li>The DNS playbook (add_dns_record.yml) running on the DNS server has access to the correct IP, host, and zone information without redundant processing.</li>
</ul>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content"><p>One of the biggest challenges I faced in this project was understanding why I couldn’t directly use the variables returned from NetBox in the DNS-related tasks. I initially tried to pass these variables directly, but the playbook failed because the DNS tasks were executed on a different host (dnsserver.lab.home) than the one that retrieved the data (localhost).</p>
<p>The solution involved using hostvars to reference the facts set on localhost. This took the most time to figure out, as I didn’t immediately realize that variables gathered on one host are not automatically accessible on another. Once I understood how hostvars works, everything started to fall into place.</p>
</div>
    </aside>
<h2 id="ok-enough-code-and-explanations-lets-see-it-in-action">Ok, Enough Code and Explanations, Let’s See It in Action</h2>
<p>Starting the playbook:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ansible-playbook mp_dns.yml -e &#34;prefix=192.168.2.0/24 dns_name=hello-world.lab.home&#34;
</span></span></code></pre></div><figure><a href="output.png"><picture><source srcset="/ipam-automation/output_hu_2c1feecce3ba3c36.png" type="image/png">
          <img
            src="/ipam-automation/output_hu_2c1feecce3ba3c36.png" alt="Ansible output" width="1003"
            height="610"/>
        </picture></a><figcaption><p>Ansible output (click to enlarge)</p></figcaption></figure>
<figure><a href="netbox.png"><picture><source srcset="/ipam-automation/netbox_hu_eb0a4b6e7b762802.png" type="image/png">
          <img
            src="/ipam-automation/netbox_hu_eb0a4b6e7b762802.png" alt="NetBox" width="1447"
            height="489"/>
        </picture></a><figcaption><p>NetBox (click to enlarge)</p></figcaption></figure>
<figure><a href="dns.png"><picture><source srcset="/ipam-automation/dns_hu_e5556e0784f477bb.png" type="image/png">
          <img
            src="/ipam-automation/dns_hu_e5556e0784f477bb.png" alt="DNS" width="404"
            height="455"/>
        </picture></a><figcaption><p>DNS (click to enlarge)</p></figcaption></figure>
<h2 id="conclusion">Conclusion</h2>
<p>This project represents just the first step toward a fully automated IPAM and DNS management workflow. While the current solution works well in my lab environment, there is plenty of room for improvement and expansion.</p>
<p>Key Takeaways:</p>
<ul>
<li>
<p>Modular Design: Starting with a modular playbook structure ensures flexibility for future enhancements and easier debugging.</p>
</li>
<li>
<p>Lab vs. Production: This setup is tailored for a lab environment. For production systems, avoid using highly privileged accounts like the local administrator on the DNS server. A more secure approach with role-based access control (RBAC) should be implemented in future iterations.</p>
</li>
<li>
<p>Continuous Improvement: I acknowledge that the playbook is not perfect. Over time, I plan to refine and optimize it, addressing any current shortcomings and making it more robust for complex workflows.</p>
</li>
</ul>
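<p>As a rough idea of what a less lab-like inventory could look like, the sketch below replaces the plain-text administrator password with a vaulted variable, switches from basic to NTLM authentication, and turns certificate validation back on. The account name svc-ansible-dns@lab.home and the variable vault_dns_password are hypothetical, and the DNS server would need a certificate that the control node trusts.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">all:
  hosts:
    dnsserver.lab.home:
      ansible_host: dc.lab.home
      ansible_user: svc-ansible-dns@lab.home        # hypothetical account with delegated DNS rights
      ansible_password: "{{ vault_dns_password }}"  # hypothetical variable kept in an Ansible Vault file
      ansible_connection: winrm
      ansible_winrm_transport: ntlm                 # avoids sending credentials with basic auth
      ansible_winrm_server_cert_validation: validate
      ansible_port: 5986
</code></pre></div>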
<p><em><strong>Automation is a journey</strong></em>, and I’m excited to see how this project evolves. Stay tuned for updates and new features in future versions!</p>
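<h2 id="update--automatic-ptr-creation">Update: Automatic PTR Creation</h2>
<p>Here’s a quick update: with the adjusted add_dns_record.yml below, a PTR record is created automatically whenever the Host A record is added. Note: The Reverse Lookup Zone must already exist, and because the zone name is built from the first three octets of the IP address, the code assumes a /24 reverse zone.</p>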
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nn">---</span><span class="w">
</span></span></span><span class="line"><span class="cl">- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Add DNS A Record</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">win_shell</span><span class="p">:</span><span class="w"> </span><span class="p">|</span><span class="sd">
</span></span></span><span class="line"><span class="cl"><span class="sd">    $ip = &#34;{{ zassigned_ip }}&#34;
</span></span></span><span class="line"><span class="cl"><span class="sd">    $hostname = &#34;{{ zdns_host }}.{{ zdns_zone }}&#34;
</span></span></span><span class="line"><span class="cl"><span class="sd">    $reverseZone = (&#34;{0}.{1}.{2}.in-addr.arpa&#34; -f $ip.Split(&#34;.&#34;)[2], $ip.Split(&#34;.&#34;)[1], $ip.Split(&#34;.&#34;)[0])
</span></span></span><span class="line"><span class="cl"><span class="sd">    Add-DnsServerResourceRecordA -Name &#34;{{ zdns_host }}&#34; -ZoneName &#34;{{ zdns_zone }}&#34; -IPv4Address &#34;{{ zassigned_ip }}&#34;
</span></span></span><span class="line"><span class="cl"><span class="sd">    Add-DnsServerResourceRecordPtr -ZoneName $reverseZone -Name ($ip.Split(&#34;.&#34;)[3]) -PtrDomainName &#34;$hostname.&#34; </span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">args</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">executable</span><span class="p">:</span><span class="w"> </span><span class="l">powershell</span><span class="w">
</span></span></span></code></pre></div>]]></content>
		</item>
		
		<item>
			<title>From Zero to Automation: How I Used ChatGPT to Create My First Ansible Playbook</title>
			<link>https://sdn-warrior.org/posts/first-steps-ansible/</link>
			<pubDate>Tue, 17 Dec 2024 22:36:18 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/first-steps-ansible/</guid>
			<description><![CDATA[How I Used ChatGPT to Create My First Ansible Playbook]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>I recently decided to automate the startup and shutdown of my lab environments—both standard and nested labs. While the idea sounded simple, it quickly turned into an interesting challenge. Having never written an Ansible Playbook before, I turned to ChatGPT for help.</p>
<h2 id="why-chatgpt">Why ChatGPT?</h2>
<p>Let’s be honest: starting with Ansible can feel overwhelming, especially if you&rsquo;re new to it. My last experience with something remotely similar was years ago, working with PowerShell scripts or even earlier with .NET 3 (yes, I’m &ldquo;that old&rdquo;).</p>
<p>The task itself seemed straightforward at first:</p>
<ul>
<li>Write a playbook to power VMs on and off in a controlled manner.</li>
<li>Integrate both my standard lab and nested lab (e.g., my VCF setup with its own vCenter).</li>
</ul>
<p>However, the challenge revealed itself quickly:</p>
<ul>
<li>Controlling VMs via my main vCenter is relatively easy.</li>
<li>But what about nested labs where each nested setup has its own vCenter?</li>
</ul>
<p>This is where ChatGPT became a game changer.</p>
<h2 id="the-approach">The Approach</h2>
<h3 id="starting-from-zero">Starting from Zero</h3>
<p>I described my setup and goals to ChatGPT:</p>
<ul>
<li>Automate VM startup/shutdown.</li>
<li>Handle dependencies like nested vCenters that control their own VMs.</li>
</ul>
<p>ChatGPT provided a clear starting point, explaining how to structure an Ansible playbook. Step by step, it introduced me to tasks, loops, and the required VMware modules.</p>
<h3 id="iterating-through-challenges">Iterating Through Challenges</h3>
<p>The major challenge was managing nested environments:</p>
<ol>
<li>Powering on the parent vCenter first.</li>
<li>Waiting until it’s responsive.</li>
<li>Then triggering the startup sequence for the nested VMs managed by that vCenter.</li>
</ol>
<p>Through multiple iterations, ChatGPT helped refine the logic.</p>
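<p>As a rough sketch of the &ldquo;wait until it’s responsive&rdquo; step (not taken from my playbooks), a task like the following could poll the nested vCenter over HTTPS before anything else is started. The variable name nested_vcenter_hostname and the retry values are assumptions for illustration, and an HTTP 200 on the root URL is only a simple approximation of &ldquo;responsive&rdquo;:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl"># Hypothetical sketch: nested_vcenter_hostname is an assumed variable name.
</span></span><span class="line"><span class="cl">- name: Wait until the nested vCenter answers HTTPS requests
</span></span><span class="line"><span class="cl">  ansible.builtin.uri:
</span></span><span class="line"><span class="cl">    url: &#34;https://{{ nested_vcenter_hostname }}/&#34;
</span></span><span class="line"><span class="cl">    validate_certs: false
</span></span><span class="line"><span class="cl">    status_code: 200
</span></span><span class="line"><span class="cl">  register: vcenter_probe
</span></span><span class="line"><span class="cl">  # Retry until the request succeeds, i.e. the expected status code came back.
</span></span><span class="line"><span class="cl">  until: vcenter_probe is succeeded
</span></span><span class="line"><span class="cl">  retries: 40   # assumed value: 40 x 15 s = up to 10 minutes
</span></span><span class="line"><span class="cl">  delay: 15
</span></span></code></pre></div>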
<h3 id="not-always-smooth-sailing">Not Always Smooth Sailing</h3>
<p>To be honest, ChatGPT’s suggestions weren’t always perfect. More than once, I found myself in a dead end. I had to point out repeatedly that the same solution, presented for the third time, simply didn’t work. This is the reality of working with AI: it doesn’t replace expertise, but it certainly accelerates the process.</p>
<p>While ChatGPT couldn’t solve everything on its own, it significantly simplified finding the right solution. Instead of starting from scratch or digging through documentation for hours, I could focus on testing and refining the playbook.</p>
<h2 id="current-progress-what-i-achieved-in-two-evenings">Current Progress: What I Achieved in Two Evenings</h2>
<p>After a couple of evenings, with a few hours of experimenting and iterating with ChatGPT, I managed to create four modular Ansible playbooks. These playbooks are designed to handle two key scenarios for starting and stopping VMs:</p>
<p>Two Playbooks for Environments with vCenter</p>
<ul>
<li>These playbooks are for my standard (non-nested) lab environments, where I can rely on vCenter to manage the VMs.</li>
<li>With vCenter in place, controlling VMs is relatively straightforward, as vCenter provides a central interface to handle power states.</li>
</ul>
<p>Two Playbooks for Environments without vCenter</p>
<ul>
<li>These playbooks handle environments where no vCenter is available, such as nested labs or standalone ESXi hosts.</li>
<li>In nested labs, the challenge arises because VMs and their dependencies are controlled individually, without the convenience of a central management interface.</li>
</ul>
<p>By separating the logic into modular playbooks, I ensured flexibility and reusability across my different lab setups. Whether I’m dealing with my regular homelab VMs or complex nested environments like my VCF setup, I can now efficiently start and stop VMs with a single command.</p>
<h3 id="inventory-files-the-backbone-of-the-setup">Inventory Files: The Backbone of the Setup</h3>
<p>To make the playbooks flexible and reusable, I created inventory YAML files for each lab. Out of habit, I named them something like vcfvm_vars.yml or vcfesx_vars.yml. These files act as the variable storage for each lab environment.</p>
<p>There are two types of inventory files:</p>
<p>For Nested VMs:</p>
<ul>
<li>Includes variables specific to nested lab setups, such as nested vCenter credentials, VM names, and their dependencies.</li>
</ul>
<p>For Non-Nested VMs:</p>
<ul>
<li>Stores details for standard VMs managed directly via the main vCenter.</li>
</ul>
<h3 id="nested-vcf-example-controlled-boot-and-shutdown">Nested VCF Example: Controlled Boot and Shutdown</h3>
<p>In my VCF setup, which is fully nested, the playbook must follow a strict sequence:</p>
<p>Startup:</p>
<ol>
<li>Start the nested ESXi hosts first.</li>
<li>Wait for their availability.</li>
<li>Then start the nested management VMs, such as NSX Manager, SDDC Manager, and vCenter.</li>
</ol>
<p>Shutdown:</p>
<ol>
<li>Stop the management VMs first.</li>
<li>Once the management layer is powered down, shut down the nested ESXi hosts.</li>
</ol>
<p>This controlled sequence ensures the nested environment behaves predictably.</p>
<h3 id="inventory-file-for-esxi-hosts">Inventory File for ESXi Hosts</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl"># vcfesx_vars.yml
</span></span><span class="line"><span class="cl">vcenter_hostname: &#34;vcsa.lab.home&#34;
</span></span><span class="line"><span class="cl">vcenter_username: &#34;administrator@vsphere.local&#34;
</span></span><span class="line"><span class="cl">vcenter_password: &#34;your_pw&#34;
</span></span><span class="line"><span class="cl">vcenter_datacenter: &#34;Homelab&#34;
</span></span><span class="line"><span class="cl">validate_certs: false
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">vm_names:
</span></span><span class="line"><span class="cl">  - &#34;sfo01-m01-esx01&#34;
</span></span><span class="line"><span class="cl">  - &#34;sfo01-m01-esx02&#34;
</span></span><span class="line"><span class="cl">  - &#34;sfo01-m01-esx03&#34;
</span></span></code></pre></div><h3 id="inventory-file-for-nested-vms">Inventory File for Nested VMs</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl"># vcfvm_vars.yml
</span></span><span class="line"><span class="cl">validate_certs: false
</span></span><span class="line"><span class="cl">esxi_hosts:
</span></span><span class="line"><span class="cl">  - &#34;sfo01-m01-esx01.lab.home&#34;
</span></span><span class="line"><span class="cl">  - &#34;sfo01-m01-esx02.lab.home&#34;
</span></span><span class="line"><span class="cl">  - &#34;sfo01-m01-esx03.lab.home&#34;
</span></span><span class="line"><span class="cl">esxi_username: &#34;root&#34;
</span></span><span class="line"><span class="cl">esxi_password: &#34;your_pw&#34;
</span></span><span class="line"><span class="cl">esxi_datacenter: &#34;sfo-m01-dc01&#34;
</span></span><span class="line"><span class="cl">vm_names:
</span></span><span class="line"><span class="cl">  - &#34;vcfvcsa&#34;
</span></span><span class="line"><span class="cl">  - &#34;vcfnsx01a&#34;
</span></span><span class="line"><span class="cl">  - &#34;vcf01&#34;
</span></span></code></pre></div><h3 id="power-on-playbook-for-non-nested-vms">Power-On Playbook for Non-Nested VMs</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">---
</span></span><span class="line"><span class="cl">- name: Start specific VMs in vCenter
</span></span><span class="line"><span class="cl">  hosts: localhost
</span></span><span class="line"><span class="cl">  gather_facts: no
</span></span><span class="line"><span class="cl">  collections:
</span></span><span class="line"><span class="cl">    - community.vmware
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  tasks:
</span></span><span class="line"><span class="cl">    - name: Load variables from file
</span></span><span class="line"><span class="cl">      include_vars: &#34;{{ vars_file }}&#34;
</span></span><span class="line"><span class="cl">    - name: Connect to vCenter and start VMs
</span></span><span class="line"><span class="cl">      community.vmware.vmware_guest_powerstate:
</span></span><span class="line"><span class="cl">        hostname: &#34;{{ vcenter_hostname }}&#34;
</span></span><span class="line"><span class="cl">        username: &#34;{{ vcenter_username }}&#34;
</span></span><span class="line"><span class="cl">        password: &#34;{{ vcenter_password }}&#34;
</span></span><span class="line"><span class="cl">        validate_certs: &#34;{{ validate_certs }}&#34;
</span></span><span class="line"><span class="cl">        name: &#34;{{ item }}&#34;
</span></span><span class="line"><span class="cl">        state: powered-on
</span></span><span class="line"><span class="cl">      loop: &#34;{{ vm_names }}&#34;
</span></span><span class="line"><span class="cl">      register: power_state_result
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Display power state result
</span></span><span class="line"><span class="cl">      debug:
</span></span><span class="line"><span class="cl">        msg: &#34;VM {{ item.item }} has been successfully started.&#34;
</span></span><span class="line"><span class="cl">      when: item.instance.hw_power_status == &#34;poweredOn&#34;
</span></span><span class="line"><span class="cl">      loop: &#34;{{ power_state_result.results }}&#34;
</span></span><span class="line"><span class="cl">      loop_control:
</span></span><span class="line"><span class="cl">        label: &#34;{{ item.item }}&#34;
</span></span></code></pre></div><ol>
<li>
<p><em><strong>include_vars:</strong></em> Loads a variable file, such as vcfesx_vars.yml, which makes the playbook modular and reusable.</p>
</li>
<li>
<p><em><strong>community.vmware.vmware_guest_powerstate:</strong></em> Uses the <em><strong>vmware_guest_powerstate</strong></em> module to control the power state of VMs in a vCenter-managed environment.</p>
</li>
<li>
<p><em><strong>state: powered-on:</strong></em> Ensures the VMs are powered on.</p>
</li>
<li>
<p><em><strong>register: power_state_result:</strong></em> Captures the result of the task execution for each VM, including its power state.</p>
</li>
<li>
<p><em><strong>debug with when:</strong></em> Checks the power state of each VM and displays a success message if the VM was successfully powered on.</p>
</li>
</ol>
<h3 id="power-on-playbook-for-nested-vms-on-multiple-esxi-hosts">Power-On Playbook for Nested VMs on Multiple ESXi Hosts</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">---
</span></span><span class="line"><span class="cl">- name: Power on multiple VMs on multiple ESXi hosts
</span></span><span class="line"><span class="cl">  hosts: localhost
</span></span><span class="line"><span class="cl">  gather_facts: no
</span></span><span class="line"><span class="cl">  collections:
</span></span><span class="line"><span class="cl">    - community.vmware
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  tasks:
</span></span><span class="line"><span class="cl">    - name: Load variables from file
</span></span><span class="line"><span class="cl">      include_vars: &#34;{{ vars_file }}&#34;
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Get VM power status for each VM on each ESXi host
</span></span><span class="line"><span class="cl">      community.vmware.vmware_guest_info:
</span></span><span class="line"><span class="cl">        hostname: &#34;{{ item.0 }}&#34;
</span></span><span class="line"><span class="cl">        username: &#34;{{ esxi_username }}&#34;
</span></span><span class="line"><span class="cl">        password: &#34;{{ esxi_password }}&#34;
</span></span><span class="line"><span class="cl">        datacenter: &#34;{{ esxi_datacenter }}&#34;
</span></span><span class="line"><span class="cl">        validate_certs: &#34;{{ validate_certs }}&#34;
</span></span><span class="line"><span class="cl">        name: &#34;{{ item.1 }}&#34;
</span></span><span class="line"><span class="cl">      with_nested:
</span></span><span class="line"><span class="cl">        - &#34;{{ esxi_hosts }}&#34;
</span></span><span class="line"><span class="cl">        - &#34;{{ vm_names }}&#34;
</span></span><span class="line"><span class="cl">      register: vm_info_results
</span></span><span class="line"><span class="cl">      ignore_errors: true
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Filter VMs that are poweredOff
</span></span><span class="line"><span class="cl">      set_fact:
</span></span><span class="line"><span class="cl">        powered_off_vms: &#34;{{ vm_info_results.results | selectattr(&#39;failed&#39;, &#39;equalto&#39;, false)
</span></span><span class="line"><span class="cl">                           | selectattr(&#39;instance.hw_power_status&#39;, &#39;equalto&#39;, &#39;poweredOff&#39;) | list }}&#34;
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Power on VMs if they are poweredOff
</span></span><span class="line"><span class="cl">      community.vmware.vmware_guest_powerstate:
</span></span><span class="line"><span class="cl">        hostname: &#34;{{ item.item.0 }}&#34;
</span></span><span class="line"><span class="cl">        username: &#34;{{ esxi_username }}&#34;
</span></span><span class="line"><span class="cl">        password: &#34;{{ esxi_password }}&#34;
</span></span><span class="line"><span class="cl">        datacenter: &#34;{{ esxi_datacenter }}&#34;
</span></span><span class="line"><span class="cl">        validate_certs: &#34;{{ validate_certs }}&#34;
</span></span><span class="line"><span class="cl">        name: &#34;{{ item.item.1 }}&#34;
</span></span><span class="line"><span class="cl">        state: powered-on
</span></span><span class="line"><span class="cl">      loop: &#34;{{ powered_off_vms }}&#34;
</span></span><span class="line"><span class="cl">      loop_control:
</span></span><span class="line"><span class="cl">        label: &#34;Host: {{ item.item.0 }} | VM: {{ item.item.1 }}&#34;
</span></span><span class="line"><span class="cl">      register: poweron_results
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Wait for VMs to be powered on
</span></span><span class="line"><span class="cl">      community.vmware.vmware_guest_info:
</span></span><span class="line"><span class="cl">        hostname: &#34;{{ item.item.item.0 }}&#34;
</span></span><span class="line"><span class="cl">        username: &#34;{{ esxi_username }}&#34;
</span></span><span class="line"><span class="cl">        password: &#34;{{ esxi_password }}&#34;
</span></span><span class="line"><span class="cl">        datacenter: &#34;{{ esxi_datacenter }}&#34;
</span></span><span class="line"><span class="cl">        validate_certs: &#34;{{ validate_certs }}&#34;
</span></span><span class="line"><span class="cl">        name: &#34;{{ item.item.item.1 }}&#34;
</span></span><span class="line"><span class="cl">      loop: &#34;{{ poweron_results.results }}&#34;
</span></span><span class="line"><span class="cl">      loop_control:
</span></span><span class="line"><span class="cl">        label: &#34;Host: {{ item.item.item.0 }} | VM: {{ item.item.item.1 }}&#34;
</span></span><span class="line"><span class="cl">      register: vm_status
</span></span><span class="line"><span class="cl">      until: vm_status.instance.hw_power_status == &#34;poweredOn&#34;
</span></span><span class="line"><span class="cl">      retries: 20
</span></span><span class="line"><span class="cl">      delay: 15
</span></span><span class="line"><span class="cl">      when: item.failed == false
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Display power on result
</span></span><span class="line"><span class="cl">      debug:
</span></span><span class="line"><span class="cl">        msg: &#34;VM {{ item.item.item.1 }} on Host {{ item.item.item.0 }} has been successfully powered on.&#34;
</span></span><span class="line"><span class="cl">      loop: &#34;{{ poweron_results.results }}&#34;
</span></span><span class="line"><span class="cl">      loop_control:
</span></span><span class="line"><span class="cl">        label: &#34;Host: {{ item.item.item.0 }} | VM: {{ item.item.item.1 }}&#34;
</span></span></code></pre></div><ol>
<li>
<p><em><strong>vmware_guest_info</strong></em> Retrieves the power state of each VM on each ESXi host.</p>
</li>
<li>
<p><em><strong>set_fact</strong></em> Filters out only those VMs that are powered off.</p>
</li>
<li>
<p><em><strong>vmware_guest_powerstate</strong></em> Powers on each VM that is in a &ldquo;poweredOff&rdquo; state.</p>
</li>
<li>
<p><em><strong>vmware_guest_info with until and retries</strong></em> Polls each VM until it reports the poweredOn state (up to 20 attempts, 15 seconds apart) before proceeding.</p>
</li>
<li>
<p><em><strong>debug</strong></em> Displays a confirmation message for each successfully powered-on VM.</p>
</li>
</ol>
<h3 id="master-playbook">Master Playbook</h3>
<p>The master playbook orchestrates the two power-on playbooks in the correct order. Between them, a static 60-second pause gives the ESXi hosts enough time to initialize. Why a pause? Without an active feedback mechanism to confirm that the ESXi servers are ready, this static wait acts as a temporary workaround and will be replaced later.</p>
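<p>A minimal sketch of an active readiness check that could replace the static pause, assuming a nested ESXi host counts as ready once it accepts connections on TCP port 443. It reuses esxi_hosts from vcfvm_vars.yml; the timeout is an assumed value, and this play is not part of my current setup:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">- name: Wait until the nested ESXi hosts accept HTTPS connections
</span></span><span class="line"><span class="cl">  hosts: localhost
</span></span><span class="line"><span class="cl">  gather_facts: no
</span></span><span class="line"><span class="cl">  tasks:
</span></span><span class="line"><span class="cl">    - name: Load the ESXi host list
</span></span><span class="line"><span class="cl">      include_vars: &#34;vcfvm_vars.yml&#34;
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Wait for TCP port 443 on each ESXi host
</span></span><span class="line"><span class="cl">      ansible.builtin.wait_for:
</span></span><span class="line"><span class="cl">        host: &#34;{{ item }}&#34;
</span></span><span class="line"><span class="cl">        port: 443
</span></span><span class="line"><span class="cl">        timeout: 600   # assumed upper limit in seconds
</span></span><span class="line"><span class="cl">      loop: &#34;{{ esxi_hosts }}&#34;
</span></span></code></pre></div>
<p>An open port does not guarantee that every ESXi service has finished starting, so this is only an approximation. The master playbook itself currently still uses the static pause:</p>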
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">---
</span></span><span class="line"><span class="cl">- name: Power on Nested ESXi Hosts
</span></span><span class="line"><span class="cl">  import_playbook: poweron_vcsa.yml
</span></span><span class="line"><span class="cl">  vars:
</span></span><span class="line"><span class="cl">    vars_file: &#34;vcfesx_vars.yml&#34;
</span></span><span class="line"><span class="cl">  # Executes the playbook to power on the nested ESXi hosts.
</span></span><span class="line"><span class="cl">  # Variables specific to ESXi servers are loaded from &#34;vcfesx_vars.yml&#34;.
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">- name: Wait for 60 seconds before powering on nested VMs
</span></span><span class="line"><span class="cl">  hosts: localhost
</span></span><span class="line"><span class="cl">  gather_facts: no
</span></span><span class="line"><span class="cl">  tasks:
</span></span><span class="line"><span class="cl">    - name: Pause for 60 seconds
</span></span><span class="line"><span class="cl">      pause:
</span></span><span class="line"><span class="cl">        seconds: 60
</span></span><span class="line"><span class="cl">      # A static wait time to ensure ESXi hosts are ready.
</span></span><span class="line"><span class="cl">      # This will be improved in the future with dynamic checks.
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">- name: Power on Nested Management VMs
</span></span><span class="line"><span class="cl">  import_playbook: poweron_esx.yml
</span></span><span class="line"><span class="cl">  vars:
</span></span><span class="line"><span class="cl">    vars_file: &#34;vcfvm_vars.yml&#34;
</span></span><span class="line"><span class="cl">  # Executes the playbook to power on nested VMs like NSX Manager, SDDC Manager, and vCenter.
</span></span><span class="line"><span class="cl">  # Variables specific to management VMs are loaded from &#34;vcfvm_vars.yml&#34;.
</span></span></code></pre></div><h3 id="starting-the-vms-via-the-ansible-master-playbook">Starting the VMs via the Ansible Master Playbook</h3>
<p>Starting my VCF nested lab has never been easier. With the Ansible Master Playbook, it’s as simple as running a single command on my Ansible server:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">ansible-playbook mp_poweron_vcf.yml
</span></span></code></pre></div><p>Within approximately 5-10 minutes (depending on the overall load on my lab), the entire VCF environment is up and ready to use—without any further manual intervention.</p>
<p>The beauty of this setup lies in its flexibility:</p>
<ul>
<li>New labs can be easily added by simply creating a new inventory file and a customized master playbook.</li>
<li>The core logic remains untouched, making it a scalable and modular solution for automating additional environments.</li>
</ul>
<p>This approach not only saves time but also ensures consistency when starting up complex nested labs like my VCF setup.</p>
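<p>As an illustration of adding a new lab, a master playbook for a second environment might look like the sketch below. The file names mp_poweron_lab2.yml, lab2esx_vars.yml, and lab2vm_vars.yml are hypothetical; only poweron_vcsa.yml and poweron_esx.yml are the existing playbooks described above:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl"># mp_poweron_lab2.yml (hypothetical file name)
</span></span><span class="line"><span class="cl">---
</span></span><span class="line"><span class="cl">- name: Power on the nested ESXi hosts of the second lab
</span></span><span class="line"><span class="cl">  import_playbook: poweron_vcsa.yml
</span></span><span class="line"><span class="cl">  vars:
</span></span><span class="line"><span class="cl">    vars_file: &#34;lab2esx_vars.yml&#34;   # hypothetical inventory file
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">- name: Give the ESXi hosts time to initialize
</span></span><span class="line"><span class="cl">  hosts: localhost
</span></span><span class="line"><span class="cl">  gather_facts: no
</span></span><span class="line"><span class="cl">  tasks:
</span></span><span class="line"><span class="cl">    - name: Pause for 60 seconds
</span></span><span class="line"><span class="cl">      pause:
</span></span><span class="line"><span class="cl">        seconds: 60
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">- name: Power on the nested management VMs of the second lab
</span></span><span class="line"><span class="cl">  import_playbook: poweron_esx.yml
</span></span><span class="line"><span class="cl">  vars:
</span></span><span class="line"><span class="cl">    vars_file: &#34;lab2vm_vars.yml&#34;    # hypothetical inventory file
</span></span></code></pre></div>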
<figure><a href="ansible.png"><picture><source srcset="/first-steps-ansible/ansible_hu_45c1f31486bb7d7e.png" type="image/png">
          <img
            src="/first-steps-ansible/ansible_hu_45c1f31486bb7d7e.png" alt="Ansible Log" width="1718"
            height="1056"/>
        </picture></a><figcaption><p>Ansible Output (click to enlarge)</p></figcaption></figure>
<p>The log output of my Ansible playbook contains failed messages during the task &ldquo;Get VM power status for each VM on each ESXi host&rdquo;. These failures occur because each ESXi host is queried for specific VMs (like vcf01) that may not exist on that particular host. This is both normal and expected behavior.</p>
<p>Why? Due to DRS (Distributed Resource Scheduler), I can never be certain which nested ESXi host a particular VM was last running on. By iterating through all ESXi hosts, the playbook ensures that the power status of every VM is eventually retrieved, regardless of where it was previously located.</p>
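<p>To make that log easier to read, a small extra task could report where each VM was actually found. This is only a sketch and not part of the playbooks shown here; it assumes it runs directly after the &ldquo;Get VM power status&rdquo; task, so that vm_info_results is available:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">- name: Show on which ESXi host each VM was found
</span></span><span class="line"><span class="cl">  debug:
</span></span><span class="line"><span class="cl">    msg: &#34;VM {{ item.item.1 }} found on host {{ item.item.0 }}&#34;
</span></span><span class="line"><span class="cl">  # Keep only the lookups that did not fail, i.e. the host really holds the VM.
</span></span><span class="line"><span class="cl">  loop: &#34;{{ vm_info_results.results | selectattr(&#39;failed&#39;, &#39;equalto&#39;, false) | list }}&#34;
</span></span><span class="line"><span class="cl">  loop_control:
</span></span><span class="line"><span class="cl">    label: &#34;{{ item.item.1 }}&#34;
</span></span></code></pre></div>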
<h3 id="shutdown-playbook-graceful-power-off-of-vms">Shutdown Playbook: Graceful Power-Off of VMs</h3>
<p>The shutdown process follows the same principles as the power-on playbook but in reverse order. Instead of starting VMs, it ensures a graceful shutdown while verifying their power state. I won&rsquo;t describe every task in detail, but here’s a quick overview:</p>
<p>Logic Similar to Power-On:</p>
<ul>
<li>VMs are iterated across multiple ESXi hosts.</li>
<li>Only VMs that are currently powered on are gracefully shut down.</li>
</ul>
<p>Graceful Shutdown with Validation:</p>
<ul>
<li>VMs are shut down using shutdown-guest to trigger the guest OS shutdown process.</li>
<li>A retry loop with retries: 20 and delay: 15 ensures that the playbook actively checks until the VMs reach the poweredOff state.</li>
</ul>
<p>Harmless Errors Handled:</p>
<ul>
<li>As with the power-on playbook, the ignore_errors: true directive handles expected failures gracefully (e.g., querying for VMs on ESXi hosts where they are not located).</li>
</ul>
<h3 id="shutdown-nested-vms">Shutdown Nested VMs</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">- name: Graceful shutdown of multiple VMs on multiple ESXi hosts
</span></span><span class="line"><span class="cl">  hosts: localhost
</span></span><span class="line"><span class="cl">  gather_facts: no
</span></span><span class="line"><span class="cl">  collections:
</span></span><span class="line"><span class="cl">    - community.vmware
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  tasks:
</span></span><span class="line"><span class="cl">    - name: Load variables from file
</span></span><span class="line"><span class="cl">      include_vars: &#34;{{ vars_file }}&#34;
</span></span><span class="line"><span class="cl">      
</span></span><span class="line"><span class="cl">    - name: Get VM power status for each VM on each ESXi host
</span></span><span class="line"><span class="cl">      community.vmware.vmware_guest_info:
</span></span><span class="line"><span class="cl">        hostname: &#34;{{ item.0 }}&#34;
</span></span><span class="line"><span class="cl">        username: &#34;{{ esxi_username }}&#34;
</span></span><span class="line"><span class="cl">        password: &#34;{{ esxi_password }}&#34;
</span></span><span class="line"><span class="cl">        datacenter: &#34;{{ esxi_datacenter }}&#34;
</span></span><span class="line"><span class="cl">        validate_certs: &#34;{{ validate_certs }}&#34;
</span></span><span class="line"><span class="cl">        name: &#34;{{ item.1 }}&#34;
</span></span><span class="line"><span class="cl">      with_nested:
</span></span><span class="line"><span class="cl">        - &#34;{{ esxi_hosts }}&#34;
</span></span><span class="line"><span class="cl">        - &#34;{{ vm_names }}&#34;
</span></span><span class="line"><span class="cl">      register: vm_info_results
</span></span><span class="line"><span class="cl">      ignore_errors: true
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Filter VMs that are poweredOn
</span></span><span class="line"><span class="cl">      set_fact:
</span></span><span class="line"><span class="cl">        powered_on_vms: &#34;{{ vm_info_results.results | selectattr(&#39;failed&#39;, &#39;equalto&#39;, false)
</span></span><span class="line"><span class="cl">                           | selectattr(&#39;instance.hw_power_status&#39;, &#39;equalto&#39;, &#39;poweredOn&#39;) | list }}&#34;
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Shut down VMs if they are poweredOn
</span></span><span class="line"><span class="cl">      community.vmware.vmware_guest_powerstate:
</span></span><span class="line"><span class="cl">        hostname: &#34;{{ item.item.0 }}&#34;
</span></span><span class="line"><span class="cl">        username: &#34;{{ esxi_username }}&#34;
</span></span><span class="line"><span class="cl">        password: &#34;{{ esxi_password }}&#34;
</span></span><span class="line"><span class="cl">        datacenter: &#34;{{ esxi_datacenter }}&#34;
</span></span><span class="line"><span class="cl">        validate_certs: &#34;{{ validate_certs }}&#34;
</span></span><span class="line"><span class="cl">        name: &#34;{{ item.item.1 }}&#34;
</span></span><span class="line"><span class="cl">        state: shutdown-guest
</span></span><span class="line"><span class="cl">        force: false
</span></span><span class="line"><span class="cl">      loop: &#34;{{ powered_on_vms }}&#34;
</span></span><span class="line"><span class="cl">      loop_control:
</span></span><span class="line"><span class="cl">        label: &#34;Host: {{ item.item.0 }} | VM: {{ item.item.1 }}&#34;
</span></span><span class="line"><span class="cl">      register: shutdown_results
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Wait for VMs to be powered off
</span></span><span class="line"><span class="cl">      community.vmware.vmware_guest_info:
</span></span><span class="line"><span class="cl">        hostname: &#34;{{ item.item.item.0 }}&#34;
</span></span><span class="line"><span class="cl">        username: &#34;{{ esxi_username }}&#34;
</span></span><span class="line"><span class="cl">        password: &#34;{{ esxi_password }}&#34;
</span></span><span class="line"><span class="cl">        datacenter: &#34;{{ esxi_datacenter }}&#34;
</span></span><span class="line"><span class="cl">        validate_certs: &#34;{{ validate_certs }}&#34;
</span></span><span class="line"><span class="cl">        name: &#34;{{ item.item.item.1 }}&#34;
</span></span><span class="line"><span class="cl">      loop: &#34;{{ shutdown_results.results }}&#34;
</span></span><span class="line"><span class="cl">      loop_control:
</span></span><span class="line"><span class="cl">        label: &#34;Host: {{ item.item.item.0 }} | VM: {{ item.item.item.1 }}&#34;
</span></span><span class="line"><span class="cl">      register: vm_status
</span></span><span class="line"><span class="cl">      until: vm_status.instance.hw_power_status == &#34;poweredOff&#34;
</span></span><span class="line"><span class="cl">      retries: 20
</span></span><span class="line"><span class="cl">      delay: 15
</span></span><span class="line"><span class="cl">      when: item.failed == false
</span></span></code></pre></div><h3 id="shutdown-playbook-for-virtual-esxi-servers-using-vcenter">Shutdown Playbook for Virtual ESXi Servers Using vCenter</h3>
<p>This playbook is very similar to the nested VM shutdown playbook, but since I can rely on the vCenter, I don’t need to iterate through all ESXi servers. This simplifies the process and improves efficiency.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">---
</span></span><span class="line"><span class="cl">- name: Graceful shutdown of specific VMs if powered on
</span></span><span class="line"><span class="cl">  hosts: localhost
</span></span><span class="line"><span class="cl">  gather_facts: no
</span></span><span class="line"><span class="cl">  collections:
</span></span><span class="line"><span class="cl">    - community.vmware
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  tasks:
</span></span><span class="line"><span class="cl">    - name: Load variables from file
</span></span><span class="line"><span class="cl">      include_vars: &#34;{{ vars_file }}&#34;
</span></span><span class="line"><span class="cl">    - name: Get VM information
</span></span><span class="line"><span class="cl">      community.vmware.vmware_guest_info:
</span></span><span class="line"><span class="cl">        hostname: &#34;{{ vcenter_hostname }}&#34;
</span></span><span class="line"><span class="cl">        username: &#34;{{ vcenter_username }}&#34;
</span></span><span class="line"><span class="cl">        password: &#34;{{ vcenter_password }}&#34;
</span></span><span class="line"><span class="cl">        datacenter: &#34;{{ vcenter_datacenter }}&#34;
</span></span><span class="line"><span class="cl">        validate_certs: &#34;{{ validate_certs }}&#34;
</span></span><span class="line"><span class="cl">        name: &#34;{{ item }}&#34;
</span></span><span class="line"><span class="cl">      loop: &#34;{{ vm_names }}&#34;
</span></span><span class="line"><span class="cl">      register: vm_info_results
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Shut down VMs gracefully if powered on
</span></span><span class="line"><span class="cl">      community.vmware.vmware_guest_powerstate:
</span></span><span class="line"><span class="cl">        hostname: &#34;{{ vcenter_hostname }}&#34;
</span></span><span class="line"><span class="cl">        username: &#34;{{ vcenter_username }}&#34;
</span></span><span class="line"><span class="cl">        password: &#34;{{ vcenter_password }}&#34;
</span></span><span class="line"><span class="cl">        validate_certs: &#34;{{ validate_certs }}&#34;
</span></span><span class="line"><span class="cl">        name: &#34;{{ item.item }}&#34;
</span></span><span class="line"><span class="cl">        state: shutdown-guest
</span></span><span class="line"><span class="cl">        force: false
</span></span><span class="line"><span class="cl">      when: item.instance.hw_power_status == &#34;poweredOn&#34;
</span></span><span class="line"><span class="cl">      loop: &#34;{{ vm_info_results.results }}&#34;
</span></span><span class="line"><span class="cl">      register: shutdown_results
</span></span><span class="line"><span class="cl">      loop_control:
</span></span><span class="line"><span class="cl">        label: &#34;{{ item.item }}&#34;
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Wait for VMs to be powered off
</span></span><span class="line"><span class="cl">      community.vmware.vmware_guest_info:
</span></span><span class="line"><span class="cl">        hostname: &#34;{{ vcenter_hostname }}&#34;
</span></span><span class="line"><span class="cl">        username: &#34;{{ vcenter_username }}&#34;
</span></span><span class="line"><span class="cl">        password: &#34;{{ vcenter_password }}&#34;
</span></span><span class="line"><span class="cl">        datacenter: &#34;{{ vcenter_datacenter }}&#34;
</span></span><span class="line"><span class="cl">        validate_certs: &#34;{{ validate_certs }}&#34;
</span></span><span class="line"><span class="cl">        name: &#34;{{ item.item }}&#34;
</span></span><span class="line"><span class="cl">      register: vm_status
</span></span><span class="line"><span class="cl">      until: vm_status.instance.hw_power_status == &#34;poweredOff&#34;
</span></span><span class="line"><span class="cl">      retries: 20
</span></span><span class="line"><span class="cl">      delay: 15
</span></span><span class="line"><span class="cl">      loop: &#34;{{ vm_info_results.results }}&#34;
</span></span><span class="line"><span class="cl">      when: item.instance.hw_power_status == &#34;poweredOn&#34;
</span></span><span class="line"><span class="cl">      loop_control:
</span></span><span class="line"><span class="cl">        label: &#34;{{ item.item }}&#34;
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    - name: Display shutdown result
</span></span><span class="line"><span class="cl">      debug:
</span></span><span class="line"><span class="cl">        msg: &#34;VM {{ item.item }} was shut down successfully or was already powered off.&#34;
</span></span><span class="line"><span class="cl">      loop: &#34;{{ vm_info_results.results }}&#34;
</span></span><span class="line"><span class="cl">      loop_control:
</span></span><span class="line"><span class="cl">        label: &#34;{{ item.item }}&#34;
</span></span></code></pre></div><ul>
<li>
<p><strong>Use of vCenter:</strong> The playbook uses vCenter directly to manage the shutdown process, which avoids manually iterating through all ESXi hosts.</p>
</li>
<li>
<p><strong>Graceful Shutdown:</strong> The <code>shutdown-guest</code> option triggers a clean shutdown of the guest operating system running on the virtual ESXi servers.</p>
</li>
<li>
<p><strong>Dynamic Verification:</strong> The playbook dynamically filters the powered-on ESXi VMs and waits until their power state is confirmed as <code>poweredOff</code>.</p>
</li>
<li>
<p><strong>Efficiency:</strong> By leveraging vCenter and a loop with retries, the process is both clean and efficient.</p>
</li>
</ul>
<h3 id="master-shutdown-playbook">Master Shutdown Playbook</h3>
<p>To orchestrate the shutdown of the nested VCF lab and its virtual ESXi servers, we’ll create a master playbook similar to the Power-On master playbook. The inventory files remain the same as those used for the Power-On process, ensuring consistency and avoiding duplication.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">- name: Poweroff Nested VMs   
</span></span><span class="line"><span class="cl">  import_playbook: shutdown_esx.yml
</span></span><span class="line"><span class="cl">  vars:
</span></span><span class="line"><span class="cl">    vars_file: &#34;vcfvm_vars.yml&#34;
</span></span><span class="line"><span class="cl">- name: Poweroff Nested ESXi
</span></span><span class="line"><span class="cl">  import_playbook: shutdown_vcsa.yml
</span></span><span class="line"><span class="cl">  vars:
</span></span><span class="line"><span class="cl">    vars_file: &#34;vcfesx_vars.yml&#34;
</span></span></code></pre></div><p>Unlike the Power-On master playbook, the shutdown process does not require a pause or workaround. This is because during the shutdown, we can actively check if the respective VMs have already powered off using a loop. This makes the process cleaner and more efficient.</p>
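<p>The polling that the wait tasks rely on can be pictured as a plain retry loop. The following Python sketch only illustrates the idea and is not how Ansible implements it: the function name <code>wait_until_powered_off</code>, the <code>get_power_state</code> callback, and the assumption of one initial check followed by up to <code>retries</code> re-checks are mine. With the values used in the playbooks (<code>retries: 20</code>, <code>delay: 15</code>), such a loop would give up after about 20 times 15 seconds, so roughly five minutes of waiting.</p>

```python
import time


def wait_until_powered_off(get_power_state, retries=20, delay=15, sleep=time.sleep):
    """Poll a VM's power state until it reports "poweredOff".

    Illustrative only: get_power_state is a hypothetical callback that
    returns the current state string (for example "poweredOn").
    Returns True once the VM is off, False if all attempts are used up.
    """
    # One initial check plus up to `retries` re-checks (an assumption of this sketch).
    for attempt in range(retries + 1):
        if get_power_state() == "poweredOff":
            return True
        # Only wait if another attempt will follow.
        if attempt != retries:
            sleep(delay)
    return False


# Simulated VM that needs three polls before it is off.
states = iter(["poweredOn", "poweredOn", "poweredOff"])
waits = []
result = wait_until_powered_off(lambda: next(states), sleep=waits.append)
print(result, waits)
```

<p>Passing <code>sleep</code> as a parameter is only there so the loop can be tried out without actually waiting.</p>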

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content"><p>The playbooks presented in this article were generated with the help of AI and subsequently adjusted to work in my specific environment. While they function as intended for my use case, I strongly recommend exercising caution and thoroughly testing these playbooks in your own environment before implementing or relying on them.</p>
<p>Automation can be powerful, but every infrastructure is unique—always test in a controlled setting first!</p>
</div>
    </aside>
<h2 id="conclusion-is-chatgpt-useful-for-ansible">Conclusion: Is ChatGPT Useful for Ansible?</h2>
<p>From my perspective, the answer is both yes and no.</p>
<p>ChatGPT gave me a solid starting point and explained a lot of the foundational concepts, which was extremely helpful as a beginner with Ansible. However, it wasn’t perfect—there were several significant errors in the generated playbooks, and more than once, the AI proposed the same incorrect solution repeatedly.</p>
<p>Despite these challenges, I still found the process enjoyable. With some manual corrections and adjustments, I was able to create playbooks that worked for my specific environment. Within just a few hours, I achieved a usable result—something that would have taken considerably longer without ChatGPT&rsquo;s assistance.</p>
<p>Ultimately, while ChatGPT cannot replace expertise or thorough testing, it’s a powerful tool to accelerate development and simplify learning, especially when working with automation tools like Ansible.</p>
]]></content>
		</item>
		
		<item>
			<title>Homelab V5</title>
			<link>https://sdn-warrior.org/posts/homelab-v5/</link>
			<pubDate>Sat, 14 Dec 2024 02:00:26 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/homelab-v5/</guid>
			<description><![CDATA[My Homelab Journey: From Unraid Beginnings to Version 5]]></description>
			<content type="html"><![CDATA[<h2 id="my-homelab-journey-from-unraid-beginnings-to-version-5">My Homelab Journey: From Unraid Beginnings to Version 5</h2>
<p>Building and optimizing a homelab has always been a passion of mine. Since its inception, my homelab has gone through several iterations, constantly evolving to meet my goals of achieving maximum performance while minimizing power consumption, noise, and physical space requirements. Here is a snapshot of my journey, culminating in the current Version 5 of my homelab.</p>
<h2 id="the-beginning-unraid-with-custom-hardware">The Beginning: Unraid with Custom Hardware</h2>
<p>My homelab journey began with a custom-built Unraid server featuring an Intel i3 11th Generation processor and 64 GB of RAM. This setup acted as an all-in-one solution for storage, virtualization, and container workloads. I even conducted simple nested vSphere tests on this server during its early days. Today, the server is still in use as a storage and Docker host, although I have replaced the underlying hardware four times to keep up with evolving requirements.</p>
<figure><picture><source srcset="/labv5/unraid_hu_dc6535ee95ca5f24.jpeg" type="image/jpeg">
          <img
            src="/labv5/unraid_hu_dc6535ee95ca5f24.jpeg"alt="Homelab v1"width="960"
            height="1280"/>
        </picture><figcaption><p>My first Homelab</p></figcaption></figure>
<p>The rack that I used from Version 1 through Version 4 of my homelab initially housed only two switches, a Raspberry Pi 3, and an old HP EliteDesk client, but it had to be replaced to accommodate the changes in Version 5.</p>
<h2 id="evolution-to-version-5">Evolution to Version 5</h2>
<p>Over the years, I continuously refined and upgraded the homelab. With my role at Evoila GmbH, my expectations for both myself and my homelab grew significantly. It quickly became clear that I needed different hardware to meet these new demands, especially as I aimed to conduct more extensive labs with NSX.</p>
<p>To start, I added a simple 3-node NUC cluster using 11th Generation Intel i5 processors. Additionally, I replaced the switches in my setup with multiple multispeed switches from Zyxel and QNAP. At the time, there were limited options on the market for 2.5 Gbps switches with management capabilities, resulting in a somewhat heterogeneous configuration.</p>
<p>Each iteration brought new hardware, better software configurations, and more ambitious goals. Over time, more technologies found their way into my lab, including a Fortinet FortiGate F40, BGP routing, and 10G switches. These advancements eventually culminated in Version 4 of my lab. However, as the lab grew, the rack ran out of space for further development, prompting the need for a complete rebuild, which led to the creation of Version 5. Now, in its fifth version, the lab has transformed into a powerful and efficient setup.</p>
<h2 id="my-lab-philosophy">My Lab Philosophy</h2>
<p>My primary goal has always been to achieve the best possible performance with minimal power consumption, noise, and space requirements. To this end, I have standardized my homelab on Intel’s 13th Generation CPUs, which strike a great balance between power efficiency and computational capability.</p>
<h2 id="lab-overview">Lab Overview</h2>
<p>In Lab Version 5, I have three clusters:</p>
<h3 id="management-cluster">Management Cluster:</h3>
<p>This cluster is powered by an Intel NUC i3 13th Generation, which serves as the always-on management node. The ESXi server in this cluster hosts several key VMs:</p>
<ul>
<li>A Windows 11 VM with tools like Hugo, Go, and GitHub for managing this blog.</li>
<li>LogInsight and FortiAnalyzer.</li>
<li>A vCenter server.</li>
<li>An mDNS Repeater for Smart Home integration.</li>
<li>Homebridge for managing smart devices.</li>
<li>NetBox for network documentation.</li>
<li>A Veeam server for backups.</li>
</ul>
<p>Additionally, two VMs on my Unraid server contribute to the management cluster:</p>
<ul>
<li>A Root CA.</li>
<li>A Domain Controller, which primarily supports the labs.</li>
</ul>
<p>The Unraid server also runs several containers, including:</p>
<ul>
<li>DNS servers.</li>
<li>An Excalidraw instance.</li>
<li>Various other tools.</li>
</ul>
<p>For redundancy, I run a backup DNS server on a Raspberry Pi 3 to ensure DNS functionality during storage maintenance. I use AdGuard Home as my primary DNS server, which blocks ads and forwards DNS queries to my lab.home domain managed by the Active Directory server.</p>
<h3 id="compute-cluster">Compute Cluster:</h3>
<p>My compute cluster consists of three 13th Generation Intel NUCs, each equipped with 64 GB of RAM and a 2 TB NVMe drive. Because of the hybrid P/E core architecture, Hyperthreading is not available under ESXi while the E cores are enabled, but each NUC still offers 12 cores (4P + 8E). Based on my experience, the performance with E cores enabled is better than using 4 P cores with Hyperthreading. Each NUC features dual 2.5G network interfaces and is connected to my iSCSI storage. This cluster runs standard nested labs, such as my NSX lab and AVI load balancer labs. The performance is sufficient for many labs, making it a reliable and frequently used part of my setup.</p>
<p>The Compute Cluster in Lab Version 5 has the following total resources:</p>
<ul>
<li>192 GB RAM across 3 NUCs</li>
<li>6 TB NVMe storage (2 TB per NUC)</li>
<li>8 TB shared storage</li>
<li>36 CPU cores (12 cores per NUC)</li>
</ul>
<h3 id="performance-cluster">Performance Cluster:</h3>
<p>My performance cluster consists of four Minisforum MS-01 units, each featuring an Intel i9 processor with 14 cores (6P + 8E), 64 GB of physical RAM, and a 400% memory tiering configuration. Each MS-01 includes 2 TB of local NVMe storage and an additional 1 TB PCIe4 NVMe drive for memory tiering. With onboard dual 10GbE networking, the MS-01 units are ideal for demanding labs, such as a complete VCF deployment including an HCX proof of concept where I live-migrated VMs between my NSX lab and the VCF lab. The MS-01 units are also used for vSAN labs. Additionally, Intel vPro support allows for efficient remote management. This cluster provides:</p>
<ul>
<li>56 CPU cores (14 cores per MS-01).</li>
<li>1280 GB of RAM (64 GB physical per unit with memory tiering).</li>
<li>8 TB of local NVMe storage (2 TB per unit).</li>
<li>4 TB of NVMe storage for memory tiering (1 TB per unit).</li>
<li>8 TB of shared storage.</li>
</ul>
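<p>The RAM figure follows from the tiering ratio. As a quick sanity check, here is a small Python sketch. It assumes that a 400% memory tiering setting means the NVMe tier adds four times the physical DRAM on top of it, which matches the 1280 GB listed above. The function name is made up for illustration.</p>

```python
def tiered_memory_gb(dram_gb, tier_percent, hosts):
    """Return the total addressable memory in GB across all hosts.

    Assumption of this sketch: tier_percent is the NVMe tier size as a
    percentage of physical DRAM, so 400 adds 4x the DRAM on top of it.
    """
    per_host = dram_gb + dram_gb * tier_percent // 100
    return per_host * hosts


# Four MS-01 units with 64 GB of DRAM each and 400% tiering.
total = tiered_memory_gb(dram_gb=64, tier_percent=400, hosts=4)
print(total)  # 1280
```

<p>Per host that is 64 GB of DRAM plus 256 GB of NVMe-backed tier, which fits comfortably on the 1 TB tiering drive.</p>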
<h3 id="network">Network</h3>
<p>My network consists of multiple MikroTik switches. The centerpiece is my ToR (Top of Rack) switch, the MikroTik CRS309, which can route at line speed thanks to hardware offloading. This switch hosts all lab-relevant gateways and networks, ensuring they don&rsquo;t need to be routed through my Fortinet FortiGate F40. The servers themselves are connected to two access switches: the NUCs via dual 2.5Gbps connections, and the MS-01 units via 10Gbps connections per switch.
I also have a service router (RB5009) that establishes a VPN tunnel to a fellow homelabber. Through this connection, I can utilize his Kubernetes resources, and we&rsquo;ve even tested the NSX Application Platform (NAPP) together. <a href="https://marschall.systems/">Visit marschall.systems</a></p>
<figure><a href="plan.png"><picture><source srcset="/labv5/plan_hu_371db67dcf765384.png" type="image/png">
          <img
            src="/labv5/plan_hu_371db67dcf765384.png"alt="Network setup"width="6206"
            height="3968"/>
        </picture></a><figcaption><p>Network setup (click to enlarge)</p></figcaption></figure>
<p>My network employs dynamic routing, with eBGP as the primary protocol peering all critical components. My BGP NSX lab peers directly with my ToR switch, ensuring high efficiency and seamless integration with the rest of the network. For my labs, I utilize both OSPF and BGP. My OSPF lab runs on two virtual ArubaCX switches and a VyOS router, which has an IP in my standard client network and provides internet access to the OSPF lab via NAT.</p>
<figure><a href="bgp.png"><picture><source srcset="/labv5/bgp_hu_c0d6467d38b8f325.png" type="image/png">
          <img
            src="/labv5/bgp_hu_c0d6467d38b8f325.png"alt="Network bgp setup"width="3111"
            height="2917"/>
        </picture></a><figcaption><p>BGP setup (click to enlarge)</p></figcaption></figure>
<h3 id="storage">Storage</h3>
<p>As my primary storage solution, I use my Unraid server. After implementing several iSCSI optimizations <a href="https://sdn-warrior.org/posts/iscsi-tuning/">(my iSCSI Blog post)</a> and installing the iSCSI Target plugin <a href="https://sdn-warrior.org/posts/unraid-storage/">(my Unraid Blog post)</a>, the server provides performant iSCSI storage capable of around 2000 MB/s for both read and write operations.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">An iSCSI Target in Linux refers to a service or daemon that enables a Linux server to present storage devices over the network using the iSCSI protocol. This allows other machines, known as iSCSI Initiators, to connect to and use these storage devices as if they were local drives.</div>
    </aside>
<h2 id="firewall">Firewall</h2>
<p>As my firewall, I use a Fortinet FortiGate F40, which I’ve had for two years. I am fortunate to have access to an NFR/LAB license through my employer at an affordable price. The FortiGate F40 handles both firewalling and IDS/IPS functionality. Additionally, I operate it in a dual-stack configuration and leverage its SD-WAN feature to load balance two WAN connections: 5G and DSL.</p>
<h2 id="lessons-learned-and-future-goals">Lessons Learned and Future Goals</h2>
<ul>
<li>
<p>Performance vs. Efficiency: Achieving the right balance between performance and efficiency requires meticulous planning and experimentation. Each hardware choice and configuration tweak contributes to the overall success of the setup.</p>
</li>
<li>
<p>Automation: In the future, I plan to incorporate more automation into my homelab. To achieve this, I have started experimenting with Terraform to streamline deployments and configurations.</p>
</li>
<li>
<p>Scaling Smartly: As my lab has grown, managing power, cooling, and network configurations has become increasingly important.</p>
</li>
<li>
<p>Continuous Improvement: My homelab is a perpetual work in progress. With each iteration, I discover new ways to optimize and expand its capabilities.</p>
</li>
</ul>
<h2 id="current-setup-bill-of-materials-bom">Current Setup: Bill of Materials (BOM)</h2>
<table>
  <thead>
      <tr>
          <th>Quantity</th>
          <th>Component</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>Server</strong></td>
          <td></td>
      </tr>
      <tr>
          <td>4</td>
          <td>Minisforum MS-01 i9 13th Gen</td>
      </tr>
      <tr>
          <td>3</td>
          <td>Asus NUC Pro i7 13th Gen</td>
      </tr>
      <tr>
          <td>1</td>
          <td>Asus NUC Pro i3 13th Gen</td>
      </tr>
      <tr>
          <td><strong>Network</strong></td>
          <td></td>
      </tr>
      <tr>
          <td>2</td>
          <td>MikroTik CRS309-1G-8S+IN</td>
      </tr>
      <tr>
          <td>1</td>
          <td>MikroTik L009UiGS-RM</td>
      </tr>
      <tr>
          <td>1</td>
          <td>MikroTik CRS326-4C+20G+2Q</td>
      </tr>
      <tr>
          <td>1</td>
          <td>MikroTik RB5009UG+S+IN</td>
      </tr>
      <tr>
          <td><strong>Storage</strong></td>
          <td></td>
      </tr>
      <tr>
          <td>1</td>
          <td>Intel NUC Extreme i7 11th Gen</td>
      </tr>
      <tr>
          <td><strong>UPS</strong></td>
          <td></td>
      </tr>
      <tr>
          <td>1</td>
          <td>APC Back-UPS Pro 1300VA BR1300MI</td>
      </tr>
      <tr>
          <td><strong>Firewall</strong></td>
          <td></td>
      </tr>
      <tr>
          <td>1</td>
          <td>Fortinet FortiGate F40</td>
      </tr>
      <tr>
          <td><strong>Other</strong></td>
          <td></td>
      </tr>
      <tr>
          <td>1</td>
          <td>21U Rack</td>
      </tr>
      <tr>
          <td>1</td>
          <td>DAC Cable / Ethernet Cable</td>
      </tr>
      <tr>
          <td>2</td>
          <td>Cable Management</td>
      </tr>
      <tr>
          <td>2</td>
          <td>Rack Mount MS-01</td>
      </tr>
      <tr>
          <td>1</td>
          <td>Rack Mount NUC</td>
      </tr>
      <tr>
          <td>2</td>
          <td>Air Vent</td>
      </tr>
      <tr>
          <td>4</td>
          <td>Rack PSU</td>
      </tr>
  </tbody>
</table>
<p>You can also find the detailed BOM with prices <a href="https://docs.google.com/spreadsheets/d/1XK32KJWiLBMlKLlPSKNDwBaX2mg4fmGFN2aEA3SIsEc/edit?pli=1&amp;gid=0#gid=0">here</a>, which I update regularly to reflect any changes in my setup.</p>
<h2 id="final-result">Final result</h2>
<p><figure><picture><source srcset="/labv5/rackmount_hu_9af3bc7d3deeb923.jpg" type="image/jpeg">
          <img
            src="/labv5/rackmount_hu_9af3bc7d3deeb923.jpg"alt="Rackmount"width="1600"
            height="1200"/>
        </picture><figcaption><p>Rackmount MS-01</p></figcaption></figure>
<figure><picture><source srcset="/labv5/rackmount2_hu_f81b19370bb88551.jpg" type="image/jpeg">
          <img
            src="/labv5/rackmount2_hu_f81b19370bb88551.jpg"alt="Rackmount2"width="2250"
            height="1500"/>
        </picture><figcaption><p>Rackmount MS-01</p></figcaption></figure>
<figure><picture><source srcset="/labv5/cable_hu_fd726cc39dd72fb.jpg" type="image/jpeg">
          <img
            src="/labv5/cable_hu_fd726cc39dd72fb.jpg"alt="Rack"width="1500"
            height="1000"/>
        </picture><figcaption><p>Rack view top</p></figcaption></figure>
<figure><picture><source srcset="/labv5/rack_hu_6fd835531a7fbd96.jpg" type="image/jpeg">
          <img
            src="/labv5/rack_hu_6fd835531a7fbd96.jpg"alt="Rack"width="1200"
            height="1600"/>
        </picture><figcaption><p>Rack view</p></figcaption></figure></p>
]]></content>
		</item>
		
		<item>
			<title>How to use QoS in NSX</title>
			<link>https://sdn-warrior.org/posts/nsx-qos/</link>
			<pubDate>Tue, 10 Dec 2024 02:00:40 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/nsx-qos/</guid>
			<description><![CDATA[How to use QoS in NSX]]></description>
			<content type="html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>Quality of Service (QoS) is a critical aspect of network performance management, especially in complex environments where NSX is deployed. NSX provides powerful QoS capabilities at both the gateway and segment levels, enabling fine-grained control over traffic prioritization and bandwidth allocation. However, understanding the differences between these two levels of QoS implementation is essential for optimizing network performance.</p>
<p>In this article, we’ll delve into how QoS functions on the gateway versus the segment in NSX, explore their respective use cases, and provide insights into selecting the right approach for your network needs. Whether you&rsquo;re managing inter-tenant traffic or fine-tuning internal traffic flows, mastering these distinctions will empower you to make informed decisions and maximize the efficiency of your NSX deployment.</p>
<h2 id="qos-on-an-nsx-segment">QoS on an NSX Segment</h2>
<p>Quality of Service (QoS) on an NSX segment focuses on managing traffic flows within a specific segment, providing comprehensive control over bandwidth and traffic priorities. Unlike gateway-level QoS, which typically manages north-south traffic at the boundary of the network, segment-level QoS applies to all traffic associated with virtual machines (VMs) on the segment, regardless of direction.</p>
<p>To implement QoS at this level, you must first create a Segment Profile, which defines the QoS policies. This includes settings such as ingress and egress traffic shaping, bandwidth limits, and DSCP marking. Once configured, this profile is attached to the segment, ensuring that the specified QoS policies are applied consistently to all VMs on that segment.</p>
<figure><a href="qos1.png"><picture><source srcset="/nsx-qos/qos1_hu_a51d2af02a2f3688.png" type="image/png">
          <img
            src="/nsx-qos/qos1_hu_a51d2af02a2f3688.png"alt="QoS Profile"width="1157"
            height="466"/>
        </picture></a><figcaption><p>QoS Profile</p></figcaption></figure>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">A crucial aspect of segment-level QoS is that it impacts all traffic originating from or destined for VMs on the segment. For example, if a Segment Profile specifies a bandwidth limit of 30 Mbps, each VM on that segment is subject to this limit for all traffic, whether it is east-west within the same data center or north-south to external networks.</div>
    </aside>
<h2 id="explanation-of-segment-qos-profile-parameters-in-nsx">Explanation of Segment QoS Profile Parameters in NSX</h2>
<h3 id="mode"><strong>Mode</strong></h3>
<p>Defines how DSCP (Differentiated Services Code Point) values are handled for traffic originating from or destined for a logical port.</p>
<ul>
<li>
<p><strong>Trusted Mode</strong>:</p>
<ul>
<li>The DSCP value from the <strong>inner packet header</strong> (original header) is copied to the <strong>outer IP header</strong> (tunnel header) for IP/IPv6 traffic.</li>
<li>For non-IP/IPv6 traffic, the default DSCP value (0) is used for the outer IP header.</li>
<li>Supported only on <strong>overlay-based logical ports</strong>.</li>
</ul>
</li>
<li>
<p><strong>Untrusted Mode</strong>:</p>
<ul>
<li>For <strong>overlay-based logical ports</strong>, the configured DSCP value is applied to the outer IP header, regardless of the inner packet type.</li>
<li>For <strong>VLAN-based logical ports</strong>, the configured DSCP value is applied to the IP/IPv6 packets&rsquo; outer IP header.</li>
<li>DSCP values can range from 0 to 63.</li>
</ul>
</li>
</ul>
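<p>The two modes can be summarized as a small decision function. The Python sketch below only restates the bullets above for overlay-based logical ports. It is not NSX code, and the function and parameter names are invented for illustration.</p>

```python
def outer_dscp(mode, inner_is_ip, inner_dscp, configured_dscp):
    """Return the DSCP value written to the outer (tunnel) IP header.

    Restates the Trusted/Untrusted rules for overlay-based logical ports.
    All names here are hypothetical; this is not an NSX API.
    """
    for value in (inner_dscp, configured_dscp):
        if value not in range(64):
            raise ValueError("DSCP values must be between 0 and 63")
    if mode == "trusted":
        # Copy the inner DSCP for IP/IPv6 traffic, fall back to 0 otherwise.
        return inner_dscp if inner_is_ip else 0
    if mode == "untrusted":
        # The configured value wins, regardless of the inner packet type.
        return configured_dscp
    raise ValueError("mode must be 'trusted' or 'untrusted'")


print(outer_dscp("trusted", True, 46, 10))
print(outer_dscp("trusted", False, 46, 10))
print(outer_dscp("untrusted", True, 46, 10))
```

<p>The three calls cover the cases from the list: an IP packet in Trusted mode keeps its own marking, a non-IP packet in Trusted mode falls back to 0, and Untrusted mode always uses the configured value.</p>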

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">DSCP settings affect only tunneled traffic and do not apply to traffic within the same hypervisor.</div>
    </aside>
<h3 id="priority"><strong>Priority</strong></h3>
<p>Specifies the <strong>DSCP priority value</strong>, which determines the level of importance for packets. The DSCP priority values range from <strong>0 to 63</strong>, with higher values indicating higher priority traffic.</p>
<h3 id="class-of-service-cos"><strong>Class of Service (CoS)</strong></h3>
<p>Defines the <strong>CoS value</strong>, applicable to VLAN-based logical ports.</p>
<ul>
<li>CoS groups similar traffic types and assigns a service priority level for each type.</li>
<li>Lower-priority traffic may experience reduced throughput or be dropped to ensure better performance for higher-priority traffic.</li>
<li>CoS can also be configured for VLAN ID 0 packets.</li>
<li>The CoS value ranges from <strong>0 to 7</strong>, where <strong>0</strong> indicates best-effort service.</li>
</ul>
<h3 id="ingress">Ingress</h3>
<p>Configures traffic shaping for <strong>outbound traffic</strong> from the VM to the logical network.</p>
<ul>
<li><strong>Average Bandwidth</strong>: The average rate of outbound traffic to prevent network congestion.</li>
<li><strong>Peak Bandwidth</strong>: The maximum traffic rate allowed to support bursts.</li>
<li><strong>Burst Size</strong>: Defines the maximum data size for a traffic burst, calculated as:</li>
</ul>
$$
\frac{\text{Peak Bandwidth (in bits per second)} \times \text{Burst Duration (in seconds)}}{8} = \text{Burst Size (in Bytes)}
$$<p>For example, with an average bandwidth of 30 Mbps, a peak of 60 Mbps, and a burst duration of 0.1 seconds, the burst size would be:</p>
$$
\frac{{60000000} \text{(bits per second)} \times {0.1} \text{(seconds)}}{8} = 750000 \text{ bytes}
$$<p>Default value is 0, which disables rate limiting.</p>
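<p>The burst size calculation above can also be written as a small Python helper, in case you want to try other values. This is just a sketch of the formula. The function name is hypothetical and the result is rounded to whole bytes.</p>

```python
def burst_size_bytes(peak_bandwidth_bps, burst_duration_s):
    """Burst size in bytes = peak bandwidth (bit/s) * burst duration (s) / 8."""
    return round(peak_bandwidth_bps * burst_duration_s / 8)


# Example from the text: 60 Mbps peak, 0.1 s burst duration.
print(burst_size_bytes(60_000_000, 0.1))  # 750000
```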
<h3 id="ingress-broadcast">Ingress Broadcast</h3>
<p>Configures traffic shaping for broadcast traffic sent from the VM to the logical network.
Works similarly to the general ingress settings, allowing custom limits for average bandwidth, peak bandwidth, and burst size for broadcast traffic.
Default value is 0, which disables rate limiting for ingress broadcast traffic.</p>
<h3 id="egress">Egress</h3>
<p>Configures traffic shaping for inbound traffic from the logical network to the VM.
Allows setting limits on the average bandwidth, peak bandwidth, and burst size for inbound traffic.
Default value is 0, which disables rate limiting on egress traffic.</p>
<p>By configuring these parameters effectively, you can ensure traffic prioritization, manage congestion, and optimize bandwidth usage for both overlay-based and VLAN-based logical ports in NSX environments.</p>
<h2 id="test-scenario-evaluating-qos-at-the-segment-level">Test Scenario: Evaluating QoS at the segment level</h2>
<p>To demonstrate the differences between a setup with and without QoS, I have created a test environment consisting of two T1 routers, each connected to its own segment and hosting VMs for testing. Both T1 routers are connected to the same Tier-0 (T0) router, providing a shared Internet connection for testing north-south traffic scenarios.</p>
<figure><a href="topo.png"><picture><source srcset="/nsx-qos/topo_hu_1c8a4233e07a51ee.png" type="image/png">
          <img
            src="/nsx-qos/topo_hu_1c8a4233e07a51ee.png" alt="NSX Environment" width="912"
            height="936"/>
        </picture></a><figcaption><p>NSX Test Environment</p></figcaption></figure>
<p>This test specifically focuses on <strong>QoS at the segment level</strong>, with the primary goal of limiting the VMs on the segment <code>LS-10.10.20.1</code> to a maximum bandwidth of <strong>30 Mbps</strong> using a QoS profile.</p>
<h3 id="t1-router-1-t1-bgp-no-qos"><strong>T1 Router 1: T1-BGP No QoS</strong></h3>
<ul>
<li><strong>Segment</strong>: <code>LS-10.10.10.1</code></li>
<li><strong>QoS Policy</strong>: None applied</li>
<li><strong>VM</strong>: <code>Alpine01</code>, IP address <code>10.10.10.10</code>
<ul>
<li>Running an instance of iPerf to act as a traffic generator and receiver.</li>
</ul>
</li>
</ul>
<p>This router and segment represent a baseline configuration without any QoS policies, allowing for a comparison of unshaped and unprioritized traffic.</p>
<h3 id="t1-router-2-t1-bgp-qos"><strong>T1 Router 2: T1-BGP QoS</strong></h3>
<ul>
<li><strong>Segment</strong>: <code>LS-10.10.20.1</code></li>
<li><strong>QoS Policy</strong>: A custom QoS profile is applied to this segment, specifically configured to limit bandwidth to 30 Mbps for all associated VMs.</li>
<li><strong>VM</strong>: <code>Alpine02</code>, IP address <code>10.10.20.10</code>
<ul>
<li>Equipped with iPerf for traffic generation and reception.</li>
<li>Includes a browser for additional testing and validation purposes.</li>
</ul>
</li>
</ul>
<h3 id="purpose-of-the-test"><strong>Purpose of the Test</strong></h3>
<p>The primary goal of this test is to validate <strong>QoS at the segment level</strong>, focusing on the following:</p>
<ul>
<li>Verifying that VMs connected to <code>LS-10.10.20.1</code> are effectively limited to 30 Mbps for the traffic they send (egress from the VM&rsquo;s perspective, ingress from the segment&rsquo;s).</li>
<li>Demonstrating that the QoS profile, configured as ingress-only, limits traffic originating from <code>Alpine02</code> to other VMs or the Internet, while traffic from <code>Alpine01</code> to <code>Alpine02</code> remains unrestricted.</li>
<li>Comparing traffic behavior between a segment with and without an applied QoS profile.</li>
<li>Assessing performance consistency under traffic shaping policies.</li>
<li>Measuring the impact of the 30 Mbps limit on both east-west and north-south traffic.</li>
</ul>
<p>These behaviors will be demonstrated using iPerf measurements, highlighting the effectiveness and boundaries of the configured QoS profile.
By analyzing the test results, we can confirm the effectiveness of the QoS profile in limiting segment-level bandwidth and understand its implications for overall network performance.</p>
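<p>If you want to repeat these measurements in a script, iPerf3 can emit JSON with the <code>-J</code> flag. The following is a minimal sketch, assuming a TCP test; the helper names and the 10% tolerance are my own assumptions and not something NSX or iPerf prescribes.</p>

```python
import json


def summarize_iperf3(json_text):
    """Return sender and receiver bitrates in Mbps from `iperf3 -J` output (TCP test)."""
    end = json.loads(json_text)["end"]
    return {
        "sender_mbps": end["sum_sent"]["bits_per_second"] / 1_000_000,
        "receiver_mbps": end["sum_received"]["bits_per_second"] / 1_000_000,
    }


def within_limit(receiver_mbps, limit_mbps, tolerance=0.10):
    """True if the measured rate stays below the limit plus an assumed tolerance."""
    return receiver_mbps <= limit_mbps * (1 + tolerance)


if __name__ == "__main__":
    # Hand-written, heavily trimmed stand-in for real `iperf3 -J` output.
    sample = json.dumps({
        "end": {
            "sum_sent": {"bits_per_second": 32_600_000},
            "sum_received": {"bits_per_second": 30_200_000},
        }
    })
    rates = summarize_iperf3(sample)
    print(rates, within_limit(rates["receiver_mbps"], limit_mbps=30))
```

<p>The receiver-side rate is the one compared against the limit, because that is the traffic that actually made it through the shaper.</p>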
<h3 id="first-test-iperf-test-from-alpine02-to-alpine03">First Test: iPerf Test from Alpine02 to Alpine01</h3>
<h4 id="test-configuration"><strong>Test Configuration</strong></h4>
<ul>
<li><strong>Source</strong>: Alpine02 (<code>10.10.20.10</code>) connected to the segment <code>LS-10.10.20.1</code> with the QoS profile applied (ingress limited to 30 Mbps).</li>
<li><strong>Destination</strong>: Alpine01 (<code>10.10.10.10</code>) connected to the segment <code>LS-10.10.10.1</code> with no QoS policy applied.</li>
<li><strong>Tool</strong>: iPerf3</li>
<li><strong>Command</strong>: <code>iperf3 -c 10.10.10.10</code></li>
</ul>
<h4 id="result-summary"><strong>Result Summary</strong></h4>
<ul>
<li><strong>Average Sender Bitrate</strong>: 32.6 Mbps</li>
<li><strong>Average Receiver Bitrate</strong>: 30.2 Mbps</li>
<li><strong>Key Observation</strong>: The sender&rsquo;s bitrate fluctuates around the 30 Mbps mark, as expected given the ingress limit applied to the <code>LS-10.10.20.1</code> segment. The receiver bitrate matches the QoS configuration, confirming that the profile effectively limits traffic from Alpine02 to Alpine01.</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">alpine02:~# iperf3 -c 10.10.10.10
</span></span><span class="line"><span class="cl">Connecting to host 10.10.10.10, port 5201
</span></span><span class="line"><span class="cl">[  5] local 10.10.20.10 port 40468 connected to 10.10.10.10 port 5201
</span></span><span class="line"><span class="cl">[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
</span></span><span class="line"><span class="cl">[  5]   0.00-1.00   sec  6.12 MBytes  51.3 Mbits/sec    0    522 KBytes       
</span></span><span class="line"><span class="cl">[  5]   1.00-2.00   sec  4.25 MBytes  35.7 Mbits/sec    0    522 KBytes       
</span></span><span class="line"><span class="cl">[  5]   2.00-3.00   sec  3.25 MBytes  27.3 Mbits/sec    0    522 KBytes       
</span></span><span class="line"><span class="cl">[  5]   3.00-4.00   sec  3.12 MBytes  26.2 Mbits/sec    0    522 KBytes       
</span></span><span class="line"><span class="cl">[  5]   4.00-5.00   sec  4.25 MBytes  35.7 Mbits/sec    0    522 KBytes       
</span></span><span class="line"><span class="cl">[  5]   5.00-6.00   sec  3.12 MBytes  26.2 Mbits/sec    0    522 KBytes       
</span></span><span class="line"><span class="cl">[  5]   6.00-7.00   sec  4.25 MBytes  35.6 Mbits/sec    0    522 KBytes       
</span></span><span class="line"><span class="cl">[  5]   7.00-8.00   sec  3.12 MBytes  26.2 Mbits/sec    2    365 KBytes       
</span></span><span class="line"><span class="cl">[  5]   8.00-9.00   sec  3.25 MBytes  27.3 Mbits/sec    0    365 KBytes       
</span></span><span class="line"><span class="cl">[  5]   9.00-10.00  sec  4.12 MBytes  34.6 Mbits/sec    0    365 KBytes       
</span></span><span class="line"><span class="cl">- - - - - - - - - - - - - - - - - - - - - - - - -
</span></span><span class="line"><span class="cl">[ ID] Interval           Transfer     Bitrate         Retr
</span></span><span class="line"><span class="cl">[  5]   0.00-10.00  sec  38.9 MBytes  32.6 Mbits/sec    2        sender
</span></span><span class="line"><span class="cl">[  5]   0.00-10.00  sec  36.0 MBytes  30.2 Mbits/sec             receiver
</span></span></code></pre></div><h3 id="second-test-iperf-test-from-alpine01-to-alpine02">Second Test: iPerf Test from Alpine01 to Alpine02</h3>
<h4 id="test-configuration-1"><strong>Test Configuration</strong></h4>
<ul>
<li><strong>Source</strong>: Alpine01 (<code>10.10.10.10</code>) connected to the segment <code>LS-10.10.10.1</code> with no QoS policy applied.</li>
<li><strong>Destination</strong>: Alpine02 (<code>10.10.20.10</code>) connected to the segment <code>LS-10.10.20.1</code> with the QoS profile applied (ingress limited to 30 Mbps).</li>
<li><strong>Tool</strong>: iPerf3</li>
<li><strong>Command</strong>: <code>iperf3 -c 10.10.20.10</code></li>
</ul>
<h4 id="result-summary-1"><strong>Result Summary</strong></h4>
<ul>
<li><strong>Average Sender Bitrate</strong>: 2.25 Gbps</li>
<li><strong>Average Receiver Bitrate</strong>: 2.25 Gbps</li>
<li><strong>Key Observation</strong>: The traffic from Alpine01 to Alpine02 is not limited by the QoS profile, as expected. This confirms the QoS profile applies only to ingress traffic on the <code>LS-10.10.20.1</code> segment.</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">alpine01:~# iperf3 -c 10.10.20.10
</span></span><span class="line"><span class="cl">Connecting to host 10.10.20.10, port 5201
</span></span><span class="line"><span class="cl">[  5] local 10.10.10.10 port 50482 connected to 10.10.20.10 port 5201
</span></span><span class="line"><span class="cl">[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
</span></span><span class="line"><span class="cl">[  5]   0.00-1.00   sec   269 MBytes  2.25 Gbits/sec  150   2.08 MBytes       
</span></span><span class="line"><span class="cl">[  5]   1.00-2.00   sec   268 MBytes  2.25 Gbits/sec    0   2.19 MBytes       
</span></span><span class="line"><span class="cl">[  5]   2.00-3.00   sec   268 MBytes  2.25 Gbits/sec  172   1.21 MBytes       
</span></span><span class="line"><span class="cl">[  5]   3.00-4.00   sec   267 MBytes  2.24 Gbits/sec  150    997 KBytes       
</span></span><span class="line"><span class="cl">[  5]   4.00-5.00   sec   267 MBytes  2.24 Gbits/sec   11    673 KBytes       
</span></span><span class="line"><span class="cl">[  5]   5.00-6.00   sec   268 MBytes  2.24 Gbits/sec    0    928 KBytes       
</span></span><span class="line"><span class="cl">[  5]   6.00-7.00   sec   268 MBytes  2.25 Gbits/sec    0   1.10 MBytes       
</span></span><span class="line"><span class="cl">[  5]   7.00-8.00   sec   268 MBytes  2.25 Gbits/sec    0   1.27 MBytes       
</span></span><span class="line"><span class="cl">[  5]   8.00-9.00   sec   269 MBytes  2.25 Gbits/sec    0   1.42 MBytes       
</span></span><span class="line"><span class="cl">[  5]   9.00-10.00  sec   268 MBytes  2.25 Gbits/sec    6   1.11 MBytes       
</span></span><span class="line"><span class="cl">- - - - - - - - - - - - - - - - - - - - - - - - -
</span></span><span class="line"><span class="cl">[ ID] Interval           Transfer     Bitrate         Retr
</span></span><span class="line"><span class="cl">[  5]   0.00-10.00  sec  2.62 GBytes  2.25 Gbits/sec  489             sender
</span></span><span class="line"><span class="cl">[  5]   0.00-10.01  sec  2.62 GBytes  2.25 Gbits/sec                  receiver
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">iperf Done.
</span></span></code></pre></div>
    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Note on QoS Profiles and Perspective</b>
        </div>
        <div class="admonition-content"><p>When working with QoS profiles on a segment, it is important to understand that the traffic shaping perspective is always <strong>from the segment&rsquo;s point of view</strong>. For example:</p>
<ul>
<li>A profile that shapes <strong>ingress traffic</strong> is applied to traffic <strong>entering the segment</strong>.</li>
<li>From the VM&rsquo;s perspective, this same traffic is considered <strong>egress traffic</strong> (leaving the VM).</li>
</ul>
<p>This distinction can initially be confusing but is crucial for correctly interpreting and configuring QoS policies in NSX environments.</p></div>
    </aside>
<h4 id="qos-profiles-apply-to-all-vms-on-the-segment"><strong>QoS Profiles Apply to All VMs on the Segment</strong></h4>
<p>QoS profiles applied at the segment level are effective for <strong>all VMs connected to that segment</strong>. In our example, this means:</p>
<ul>
<li>Every VM connected to the segment <code>LS-10.10.20.1</code> is limited to a maximum <strong>outgoing traffic rate of 30 Mbps</strong>.</li>
<li>The QoS profile ensures this bandwidth limit is enforced uniformly, regardless of the specific VM or traffic destination.</li>
</ul>
<p>This behavior highlights the segment-wide scope of QoS policies, making them a powerful tool for managing traffic flow consistently across all connected VMs.</p>
<h2 id="qos-on-t1-gateway-level">QoS on T1 Gateway Level</h2>
<p>To evaluate QoS at the Tier-1 Gateway level, the test conditions remain the same as in the segment-level QoS tests. However, the QoS profile will now be applied directly to the T1 Gateway. Before proceeding, the QoS profile is removed from the segment <code>LS-10.10.20.1</code>.</p>
<h2 id="qos-profile-configuration-on-t1-gateway"><strong>QoS Profile Configuration on T1 Gateway</strong></h2>
<p>For the T1 Gateway, the QoS profile is applied with the following characteristics:</p>
<ul>
<li><strong>Type</strong>: Ingress</li>
<li><strong>Committed Bandwidth</strong>: 30 Mbps</li>
<li><strong>Burst Size</strong>: Configured based on constraints (explained below).</li>
</ul>
<p>Unlike segment-level QoS, the T1 Gateway QoS profile allows only the configuration of <strong>Committed Bandwidth</strong> and <strong>Burst Size</strong> in bytes. The direction (Ingress or Egress) is explicitly specified when applying the profile to the gateway.</p>
<h3 id="limitations-of-gateway-qos-profiles"><strong>Limitations of Gateway QoS Profiles</strong></h3>
<ul>
<li><strong>Supported only on Tier-1 Gateways</strong>:
<ul>
<li>QoS profiles can only be applied to T1 Gateways, not to Tier-0 Gateways or any other components.</li>
</ul>
</li>
<li><strong>Applies only to North-South Traffic</strong>:
<ul>
<li>QoS policies on Tier-1 Gateways are limited to north-south traffic and do not affect overlay segments or service interfaces connected to the gateway.</li>
</ul>
</li>
<li><strong>Requires Active-Standby Mode</strong>:
<ul>
<li>The T1 Gateway must be in active-standby mode with an NSX Edge cluster for the QoS profile to function.</li>
</ul>
</li>
<li><strong>Not Supported for Distributed Routing</strong>:
<ul>
<li>Gateways configured for distributed routing cannot have QoS profiles applied.</li>
</ul>
</li>
</ul>
<h3 id="burst-size-calculation"><strong>Burst Size Calculation</strong></h3>
<p>The calculation of the <strong>Burst Size</strong> for a T1 Gateway is more complex due to additional constraints. The Burst Size must satisfy the following:</p>
<ol>
<li>
<p><strong>Token Refill per Interval</strong>:</p>
\[B \geq \frac{R \times 1000000 \times I}{1000 \times 8} \]<p>
Where:</p>
</li>
</ol>
<ul>
<li>\( B \): Burst Size in Bytes</li>
<li>\( R \): Committed Bandwidth in Mbps</li>
<li>\( I \): Refill Interval in milliseconds (e.g., 1 ms)</li>
</ul>
<ol start="2">
<li><strong>Minimum Refill Interval</strong>:
\[ B \geq \frac{R \times 1000000 \times 1}{1000 \times 8} \]</li>
</ol>
<ul>
<li>The minimum interval \( I \) is 1 ms to account for dataplane CPU usage.</li>
</ul>
<ol start="3">
<li><strong>MTU Constraint</strong>:
\[ B \geq MTU \] of the Service Router (SR) port.</li>
</ol>
<ul>
<li>The Burst Size must accommodate at least one full MTU-size packet.</li>
</ul>
<p>The effective Burst Size must satisfy all three constraints. Therefore, the configured Burst Size is determined as:</p>
\[
B = \text{Max} \left( \frac{R \times 1000000 \times I}{1000 \times 8}, \frac{R \times 1000000 \times 1}{1000 \times 8}, MTU \right)
\]<h3 id="burst-size-calculation-example">Burst Size Calculation Example</h3>
<h4 id="parameters"><strong>Parameters</strong></h4>
<ul>
<li>\( R \): <strong>Committed Bandwidth</strong> = 30 Mbps</li>
<li>\( I \): <strong>Refill Interval</strong> = 1000 ms (1 second)</li>
<li>\( MTU \): <strong>Maximum Transmission Unit</strong> = 1500 bytes</li>
</ul>
<p>The Burst Size \( B \) must satisfy the following constraints:</p>
<h4 id="1-token-refill-per-interval"><strong>1. Token Refill per Interval</strong></h4>
\[
B \geq \frac{R \times 1000000 \times I}{1000 \times 8}
\]<p>
Substitute the values:
</p>
\[
B \geq \frac{30 \times 1000000 \times 1000}{1000 \times 8}
\]<p>
</p>
\[
B \geq \frac{30000000000}{8000}
\]<p>
</p>
\[
B \geq 3750000 \, \text{bytes}
\]<h4 id="2-minimum-refill-interval"><strong>2. Minimum Refill Interval</strong></h4>
\[
B \geq \frac{R \times 1000000 \times 1}{1000 \times 8}
\]<p>
This constraint always uses the minimum interval of 1 ms, regardless of the configured \( I \):
</p>
\[
B \geq \frac{30 \times 1000000 \times 1}{1000 \times 8} = 3750 \, \text{bytes}
\]<h4 id="3-mtu-constraint"><strong>3. MTU Constraint</strong></h4>
\[
B \geq MTU
\]<p>
</p>
\[
B \geq 1500 \, \text{bytes}
\]<h4 id="final-burst-size"><strong>Final Burst Size</strong></h4>
<p>The Burst Size \( B \) must satisfy <strong>all three constraints</strong>, so:
</p>
\[
B = \text{Max}(3750000, 3750, 1500)
\]<p>
</p>
\[
B = 3750000 \, \text{bytes}
\]<h3 id="result"><strong>Result</strong></h3>
<p>The minimum Burst Size required is <strong>3750000 bytes</strong> to satisfy all constraints with the given parameters.</p>
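<p>The three constraints can also be checked with a short Python sketch. The function and parameter names are hypothetical and only mirror the symbols \( R \), \( I \), and MTU used above; this is not an NSX API.</p>

```python
import math


def gateway_min_burst_size_bytes(committed_mbps, refill_interval_ms, mtu_bytes):
    """Smallest burst size (bytes) that satisfies all three constraints."""
    bits_per_ms = committed_mbps * 1_000_000 / 1000
    refill = bits_per_ms * refill_interval_ms / 8   # tokens refilled per interval I
    minimum_interval = bits_per_ms * 1 / 8          # same formula with I fixed at 1 ms
    return math.ceil(max(refill, minimum_interval, mtu_bytes))


if __name__ == "__main__":
    # Parameters from the example: R = 30 Mbps, I = 1000 ms, MTU = 1500 bytes.
    print(gateway_min_burst_size_bytes(30, 1000, 1500))
```

<p>With a very low committed bandwidth and a short interval, the MTU becomes the deciding constraint, which is why it is part of the maximum.</p>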

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Minimum Refill Interval</b>
        </div>
        <div class="admonition-content">Use the <code>get dataplane</code> command from the NSX Edge CLI to retrieve the refill interval, <code>Qos_wakeup_interval_ms</code>. The default value is 50 ms, but the dataplane adjusts it automatically based on the QoS configuration. In my lab, the value is relatively high, partly because of my hardware and partly because it is a nested lab. In production environments, it is typically lower.</div>
    </aside>
<h3 id="implementation-for-this-test"><strong>Implementation for This Test</strong></h3>
<ul>
<li>The T1 Gateway is configured with a 30 Mbps Committed Bandwidth and a Burst Size that satisfies the constraints above.</li>
<li>The profile is applied in <strong>Ingress</strong> mode to test incoming north-south traffic through the T1 Gateway.</li>
</ul>
<figure><a href="t1qos.png"><picture><source srcset="/nsx-qos/t1qos_hu_4398db0d754d0dfa.png" type="image/png">
          <img
            src="/nsx-qos/t1qos_hu_4398db0d754d0dfa.png" alt="T1 QoS Profile" width="1433"
            height="821"/>
        </picture></a><figcaption><p>T1 QoS Profile</p></figcaption></figure>
<p>This setup will help analyze how traffic shaping and rate limiting function at the T1 Gateway level compared to the segment-level QoS.</p>
<h2 id="first-test-iperf-test-from-alpine01-to-alpine02-t1-gateway-qos">First Test: iPerf Test from Alpine01 to Alpine02 (T1 Gateway QoS)</h2>
<h3 id="test-configuration-2"><strong>Test Configuration</strong></h3>
<ul>
<li><strong>Source</strong>: Alpine01 (<code>10.10.10.10</code>) connected to the segment <code>LS-10.10.10.1</code> with no QoS profile applied.</li>
<li><strong>Destination</strong>: Alpine02 (<code>10.10.20.10</code>) connected to the segment <code>LS-10.10.20.1</code>.</li>
<li><strong>QoS Profile</strong>: Applied to the T1 Gateway with:
<ul>
<li><strong>Type</strong>: Ingress</li>
<li><strong>Committed Bandwidth</strong>: 30 Mbps</li>
<li><strong>Burst Size</strong>: Configured according to the calculated constraints.</li>
</ul>
</li>
<li><strong>Tool</strong>: iPerf3</li>
<li><strong>Command</strong>: <code>iperf3 -c 10.10.20.10</code></li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">alpine01:~# iperf3 -c 10.10.20.10 
</span></span><span class="line"><span class="cl">Connecting to host 10.10.20.10, port 5201
</span></span><span class="line"><span class="cl">[  5] local 10.10.10.10 port 33292 connected to 10.10.20.10 port 5201
</span></span><span class="line"><span class="cl">[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
</span></span><span class="line"><span class="cl">[  5]   0.00-1.00   sec  7.25 MBytes  60.8 Mbits/sec  283   5.66 KBytes       
</span></span><span class="line"><span class="cl">[  5]   1.00-2.00   sec  3.25 MBytes  27.3 Mbits/sec  369   5.66 KBytes       
</span></span><span class="line"><span class="cl">[  5]   2.00-3.00   sec  3.00 MBytes  25.2 Mbits/sec  233   52.3 KBytes       
</span></span><span class="line"><span class="cl">[  5]   3.00-4.00   sec  3.75 MBytes  31.5 Mbits/sec  392   7.07 KBytes       
</span></span><span class="line"><span class="cl">[  5]   4.00-5.00   sec  3.12 MBytes  26.2 Mbits/sec  284   41.0 KBytes       
</span></span><span class="line"><span class="cl">[  5]   5.00-6.00   sec  3.88 MBytes  32.5 Mbits/sec  356   8.48 KBytes       
</span></span><span class="line"><span class="cl">[  5]   6.00-7.00   sec  3.25 MBytes  27.3 Mbits/sec  270   7.07 KBytes       
</span></span><span class="line"><span class="cl">[  5]   7.00-8.00   sec  3.62 MBytes  30.4 Mbits/sec  315    110 KBytes       
</span></span><span class="line"><span class="cl">[  5]   8.00-9.00   sec  3.50 MBytes  29.4 Mbits/sec  407   7.07 KBytes       
</span></span><span class="line"><span class="cl">[  5]   9.00-10.00  sec  3.00 MBytes  25.2 Mbits/sec  245   12.7 KBytes       
</span></span><span class="line"><span class="cl">- - - - - - - - - - - - - - - - - - - - - - - - -
</span></span><span class="line"><span class="cl">[ ID] Interval           Transfer     Bitrate         Retr
</span></span><span class="line"><span class="cl">[  5]   0.00-10.00  sec  37.6 MBytes  31.6 Mbits/sec  3154             sender
</span></span><span class="line"><span class="cl">[  5]   0.00-10.00  sec  37.1 MBytes  31.1 Mbits/sec                  receiver
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">iperf Done.
</span></span></code></pre></div><h3 id="result-summary-2">Result Summary</h3>
<ul>
<li>Average Sender Bitrate: 31.6 Mbps</li>
<li>Average Receiver Bitrate: 31.1 Mbps</li>
<li>Total Retransmissions: 3154</li>
</ul>
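<p>To pull these summary values out of the plain-text output instead of reading them by hand, a small parser is enough. This is a sketch that assumes the summary line layout shown in the outputs in this post; the regular expression and helper name are my own.</p>

```python
import re

# Matches the trailing part of an iperf3 summary line, e.g.
# "31.6 Mbits/sec  3154             sender"
SUMMARY = re.compile(r"([\d.]+) ([KMG])bits/sec\s+(\d+)?\s*(sender|receiver)\s*$")
UNIT_TO_MBPS = {"K": 0.001, "M": 1.0, "G": 1000.0}


def parse_summary(output):
    """Map 'sender'/'receiver' to (bitrate in Mbps, retransmits or None)."""
    result = {}
    for line in output.splitlines():
        match = SUMMARY.search(line)
        if match:
            rate, unit, retr, role = match.groups()
            retransmits = int(retr) if retr else None
            result[role] = (float(rate) * UNIT_TO_MBPS[unit], retransmits)
    return result


if __name__ == "__main__":
    sample = (
        "[  5]   0.00-10.00  sec  37.6 MBytes  31.6 Mbits/sec  3154             sender\n"
        "[  5]   0.00-10.00  sec  37.1 MBytes  31.1 Mbits/sec                  receiver\n"
    )
    print(parse_summary(sample))
```

<p>Per-interval lines are ignored because they do not end in <code>sender</code> or <code>receiver</code>.</p>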
<h2 id="second-test-iperf-test-from-alpine02-to-alpine03-t1-gateway-qos">Second Test: iPerf Test from Alpine02 to Alpine01 (T1 Gateway QoS)</h2>
<h3 id="test-configuration-3"><strong>Test Configuration</strong></h3>
<ul>
<li><strong>Source</strong>: Alpine02 (<code>10.10.20.10</code>) connected to the segment <code>LS-10.10.20.1</code> with no QoS profile applied.</li>
<li><strong>Destination</strong>: Alpine01 (<code>10.10.10.10</code>) connected to the segment <code>LS-10.10.10.1</code> with no QoS profile applied.</li>
<li><strong>QoS Profile</strong>: Applied to the T1 Gateway in <strong>Ingress</strong> mode, limiting ingress traffic from the T0 to the T1 Gateway to 30 Mbps.</li>
<li><strong>Tool</strong>: iPerf3</li>
<li><strong>Command</strong>: <code>iperf3 -c 10.10.10.10</code></li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">alpine02:~# iperf3 -c 10.10.10.10
</span></span><span class="line"><span class="cl">Connecting to host 10.10.10.10, port 5201
</span></span><span class="line"><span class="cl">[  5] local 10.10.20.10 port 50200 connected to 10.10.10.10 port 5201
</span></span><span class="line"><span class="cl">[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
</span></span><span class="line"><span class="cl">[  5]   0.00-1.00   sec   249 MBytes  2.09 Gbits/sec    5    615 KBytes       
</span></span><span class="line"><span class="cl">[  5]   1.00-2.00   sec   267 MBytes  2.24 Gbits/sec    0    885 KBytes       
</span></span><span class="line"><span class="cl">[  5]   2.00-3.00   sec   159 MBytes  1.34 Gbits/sec   73    290 KBytes       
</span></span><span class="line"><span class="cl">[  5]   3.00-4.00   sec   248 MBytes  2.08 Gbits/sec    0    677 KBytes       
</span></span><span class="line"><span class="cl">[  5]   4.00-5.00   sec   261 MBytes  2.19 Gbits/sec    0    928 KBytes       
</span></span><span class="line"><span class="cl">[  5]   5.00-6.00   sec   264 MBytes  2.22 Gbits/sec    1    792 KBytes       
</span></span><span class="line"><span class="cl">[  5]   6.00-7.00   sec   260 MBytes  2.18 Gbits/sec   26    441 KBytes       
</span></span><span class="line"><span class="cl">[  5]   7.00-8.00   sec   257 MBytes  2.16 Gbits/sec    0    766 KBytes       
</span></span><span class="line"><span class="cl">[  5]   8.00-9.00   sec   262 MBytes  2.20 Gbits/sec    7    585 KBytes       
</span></span><span class="line"><span class="cl">[  5]   9.00-10.00  sec   262 MBytes  2.19 Gbits/sec    0    858 KBytes       
</span></span><span class="line"><span class="cl">- - - - - - - - - - - - - - - - - - - - - - - - -
</span></span><span class="line"><span class="cl">[ ID] Interval           Transfer     Bitrate         Retr
</span></span><span class="line"><span class="cl">[  5]   0.00-10.00  sec  2.43 GBytes  2.09 Gbits/sec  112             sender
</span></span><span class="line"><span class="cl">[  5]   0.00-10.00  sec  2.43 GBytes  2.09 Gbits/sec                  receiver
</span></span></code></pre></div><h3 id="result-summary-3">Result Summary</h3>
<ul>
<li>Average Sender Bitrate: 2.09 Gbps</li>
<li>Average Receiver Bitrate: 2.09 Gbps</li>
<li>Total Retransmissions: 112</li>
</ul>
<h2 id="third-test-iperf-test-from-alpine01-to-alpine02-concurrent-traffic">Third Test: iPerf Test from Alpine01 to Alpine02 (Concurrent Traffic)</h2>
<h3 id="test-configuration-4"><strong>Test Configuration</strong></h3>
<ul>
<li><strong>Source</strong>: Alpine01 (<code>10.10.10.10</code>) connected to the segment <code>LS-10.10.10.1</code> with no QoS profile applied.</li>
<li><strong>Destination</strong>: Alpine02 (<code>10.10.20.10</code>) connected to the segment <code>LS-10.10.20.1</code>.</li>
<li><strong>Additional Traffic</strong>: A second VM, connected to the same T1 Gateway as Alpine02, is concurrently receiving data to simulate shared bandwidth conditions.</li>
<li><strong>QoS Profile</strong>: Applied to the T1 Gateway in <strong>Ingress</strong> mode, limiting traffic to 30 Mbps for ingress traffic to the gateway.</li>
<li><strong>Tool</strong>: iPerf3</li>
<li><strong>Command</strong>: <code>iperf3 -c 10.10.20.10</code></li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">alpine01:~# iperf3 -c 10.10.20.10 
</span></span><span class="line"><span class="cl">Connecting to host 10.10.20.10, port 5201
</span></span><span class="line"><span class="cl">[  5] local 10.10.10.10 port 40898 connected to 10.10.20.10 port 5201
</span></span><span class="line"><span class="cl">[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
</span></span><span class="line"><span class="cl">[  5]   0.00-1.00   sec  2.88 MBytes  24.1 Mbits/sec  475   8.48 KBytes       
</span></span><span class="line"><span class="cl">[  5]   1.00-2.00   sec  2.50 MBytes  21.0 Mbits/sec  212   53.7 KBytes       
</span></span><span class="line"><span class="cl">[  5]   2.00-3.00   sec  2.12 MBytes  17.8 Mbits/sec  233   9.90 KBytes       
</span></span><span class="line"><span class="cl">[  5]   3.00-4.00   sec  2.12 MBytes  17.8 Mbits/sec  145   1.41 KBytes       
</span></span><span class="line"><span class="cl">[  5]   4.00-5.00   sec  2.50 MBytes  21.0 Mbits/sec  237   14.1 KBytes       
</span></span><span class="line"><span class="cl">[  5]   5.00-6.00   sec  1.62 MBytes  13.6 Mbits/sec  145   22.6 KBytes       
</span></span><span class="line"><span class="cl">[  5]   6.00-7.00   sec  3.12 MBytes  26.2 Mbits/sec  331   5.66 KBytes       
</span></span><span class="line"><span class="cl">[  5]   7.00-8.00   sec  1.00 MBytes  8.39 Mbits/sec  155   7.07 KBytes       
</span></span><span class="line"><span class="cl">[  5]   8.00-9.00   sec  2.62 MBytes  22.0 Mbits/sec  142   86.3 KBytes       
</span></span><span class="line"><span class="cl">[  5]   9.00-10.00  sec  3.12 MBytes  26.2 Mbits/sec  349   22.6 KBytes       
</span></span><span class="line"><span class="cl">- - - - - - - - - - - - - - - - - - - - - - - - -
</span></span><span class="line"><span class="cl">[ ID] Interval           Transfer     Bitrate         Retr
</span></span><span class="line"><span class="cl">[  5]   0.00-10.00  sec  23.6 MBytes  19.8 Mbits/sec  2424             sender
</span></span><span class="line"><span class="cl">[  5]   0.00-10.00  sec  22.4 MBytes  18.8 Mbits/sec                  receiver
</span></span></code></pre></div><h3 id="result-summary-4">Result Summary</h3>
<ul>
<li>Average Sender Bitrate: 19.8 Mbps</li>
<li>Average Receiver Bitrate: 18.8 Mbps</li>
<li>Total Retransmissions: 2424</li>
<li>The QoS profile at the T1 Gateway limits the total ingress bandwidth to 30 Mbps. Because that budget is shared, the concurrent traffic to the second VM reduces the bandwidth left for the Alpine01 to Alpine02 flow.</li>
</ul>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Note on QoS Profiles and Perspective</b>
        </div>
        <div class="admonition-content"><p>For a QoS profile applied at the gateway level, the <strong>Ingress</strong> direction refers to traffic entering the Tier-1 (T1) Gateway from the Tier-0 (T0) Gateway. This means:</p>
<ul>
<li>From the perspective of a VM, this is indeed <strong>ingress traffic</strong>, as it refers to traffic arriving at the VM via the T1 Gateway.</li>
<li>The <strong>egress traffic</strong> of a VM (traffic leaving the VM) is not affected by an ingress QoS profile applied at the T1 Gateway.</li>
</ul>
<p>This distinction ensures that the QoS profile at the gateway level is only applied to traffic coming from the T0 to the T1, without influencing outgoing traffic generated by the VM.</p></div>
    </aside>
<h2 id="summary">Summary</h2>
<p>When working with QoS profiles in NSX, it is important to understand the key differences and use cases for <strong>Gateway QoS Profiles</strong> and <strong>Segment QoS Profiles</strong>:</p>
<h3 id="gateway-qos-profiles"><strong>Gateway QoS Profiles</strong></h3>
<ul>
<li><strong>Shared Bandwidth</strong>: The configured bandwidth applies to the <strong>total traffic</strong> for all VMs connected to the same T1 Gateway. This includes all segments attached to that gateway.</li>
<li><strong>Practical Use Case</strong>: Gateway QoS profiles are ideal for scenarios like test environments, where you want to limit the total available bandwidth across all VMs and segments.</li>
<li><strong>Traffic Direction</strong>: The QoS direction is critical:
<ul>
<li><strong>Ingress QoS</strong>: Limits traffic entering the T1 Gateway (from T0 to T1), affecting ingress traffic from the VM&rsquo;s perspective.</li>
<li><strong>Egress QoS</strong>: Limits traffic leaving the T1 Gateway (from T1 to T0), affecting egress traffic from the VM&rsquo;s perspective.</li>
</ul>
</li>
</ul>
<h3 id="segment-qos-profiles"><strong>Segment QoS Profiles</strong></h3>
<ul>
<li><strong>Individual Bandwidth Allocation</strong>: Each VM connected to the segment receives the bandwidth specified in the profile. VMs do <strong>not share</strong> the bandwidth; they each receive the assigned limit (assuming the total environment can provide the required bandwidth).</li>
<li><strong>Flexibility</strong>: Segment QoS profiles offer more granular control and can be used for more than just rate limiting. For example, they can prioritize or shape specific types of traffic.</li>
<li><strong>Traffic Direction</strong>: The direction in the profile (Ingress or Egress) must be carefully considered based on what you want to achieve.</li>
</ul>
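<p>The difference between the shared gateway budget and the per-VM segment budget can be expressed as a tiny model. This is only an illustrative sketch with a hypothetical function name; it ignores physical limits and protocol overhead.</p>

```python
def aggregate_ceiling_mbps(limit_mbps, vm_count, scope):
    """Upper bound on the combined rate-limited traffic of `vm_count` VMs."""
    if scope == "gateway":
        return limit_mbps              # one shared budget for every VM behind the T1
    if scope == "segment":
        return limit_mbps * vm_count   # each VM gets its own budget
    raise ValueError("scope must be 'gateway' or 'segment'")


if __name__ == "__main__":
    # Two VMs with a 30 Mbps profile, as in the tests above.
    print(aggregate_ceiling_mbps(30, 2, "gateway"))
    print(aggregate_ceiling_mbps(30, 2, "segment"))
```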
<h3 id="key-considerations"><strong>Key Considerations</strong></h3>
<ul>
<li><strong>Shared vs. Dedicated Bandwidth</strong>: Use Gateway QoS Profiles when you want to manage total bandwidth collectively for all VMs. Use Segment QoS Profiles when you need to allocate specific bandwidth to individual VMs.</li>
<li><strong>Performance Impact</strong>: Avoid using Gateway QoS Profiles in scenarios requiring high performance and scalability. The active/standby limitation can create bottlenecks, making distributed T1 Gateways the better choice for performance-critical environments.</li>
</ul>
<p>By understanding these differences, you can effectively apply QoS profiles to achieve desired traffic shaping and bandwidth management goals in your NSX environment.</p>
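<p>The shared versus per-VM distinction can be made concrete with a little arithmetic. The following is only a sketch: the function names and the 100 Mbit/s limit with four VMs are illustrative assumptions, not values from an NSX configuration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"># Aggregate bandwidth ceiling (in Mbit/s) for a group of VMs.
# Arguments: limit per profile, number of VMs. Names are hypothetical.

# Gateway QoS profile: one limit shared by every VM behind the T1 Gateway.
gateway_ceiling() {
  echo "$1"
}

# Segment QoS profile: the limit applies to each VM individually.
segment_ceiling() {
  echo $(( $1 * $2 ))
}

echo "Gateway profile, 100 Mbit/s, 4 VMs: $(gateway_ceiling 100 4) Mbit/s in total"
echo "Segment profile, 100 Mbit/s, 4 VMs: up to $(segment_ceiling 100 4) Mbit/s in total"
</code></pre></div>
<p>The segment figure is an upper bound and only holds if the underlying environment can actually deliver that much bandwidth.</p>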
]]></content>
		</item>
		
		<item>
			<title>NSX 4.2.1.1 Hotfix Update</title>
			<link>https://sdn-warrior.org/posts/nsx4_2_1_1/</link>
			<pubDate>Mon, 09 Dec 2024 14:55:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/nsx4_2_1_1/</guid>
			<description><![CDATA[short summary of the NSX update 4.2.1.1]]></description>
			<content type="html"><![CDATA[<p>The latest NSX update delivers a comprehensive set of fixes to enhance stability, performance, and security. Here’s a summary of the resolved issues and their impact:</p>
<h2 id="1-enhanced-stability-for-virtual-environments">1. Enhanced Stability for Virtual Environments</h2>
<ul>
<li>Loss of IP Bindings after vMotion (Issue 3453866): Addressed the removal of IP bindings and logical ports associated with VMs during vMotion events.</li>
<li>Critical ESXi Errors with UENS (Issue 3456283): Fixed intermittent PSODs caused by control priority filter lookups, ensuring smoother ESXi operations.</li>
<li>Portgroup Creation Issue (Issue 3458111): Resolved the creation of additional portgroups during full sync, preventing potential vCenter crashes.</li>
<li>Transport Zone Reference Issue (Issue 3454291): Fixed transport zone profile mismatches, restoring vMotion and service functionality.</li>
</ul>
<h2 id="2-improved-network-performance">2. Improved Network Performance</h2>
<ul>
<li>TCP Packet Drops in EDP (Issue 3457047): Resolved issues causing TCP connection drops when using Enhanced Datapath configurations.</li>
<li>Packet Reordering with LRO (Issue 3456533): Fixed packet reordering issues when HW Large Receive Offload is enabled, improving TCP throughput.</li>
<li>Reduced Traffic Performance with UENS and LRO (Issue 3456289): Addressed performance degradation in vSAN workloads.</li>
</ul>
<h2 id="3-robust-security-and-monitoring">3. Robust Security and Monitoring</h2>
<ul>
<li>NSX UI Alarm for Metrics Delivery Failure (Issue 3456663): Fixed authentication issues following certificate changes to restore metrics delivery.</li>
<li>IDPS and TLS Prevention (Issue 3458040): Enhanced malicious traffic prevention by resolving decryption issues with IDPS.</li>
<li>IDPS Events and Certificate Verification (Issue 3458038): Restored the flow of IDPS events to Security Intelligence by fixing Kafka channel errors.</li>
</ul>
<h2 id="4-stability-in-upgrades-and-configurations">4. Stability in Upgrades and Configurations</h2>
<ul>
<li>NSX Manager Slowness (Issue 3453882): Resolved slowness and instability in NSX Manager post-upgrade.</li>
<li>Edge Node IP Table Rules (Issue 3452795): Ensured proper application of IP table rules on Edge nodes.</li>
<li>NSX Configuration Realization (Issue 3452794): Fixed issues preventing configuration realization on Transport Nodes.</li>
</ul>
<h2 id="5-enhancements-in-distributed-firewall-and-flow-management">5. Enhancements in Distributed Firewall and Flow Management</h2>
<ul>
<li>DFW Rules During Upgrade (Issue 3450247): Mitigated periods where DFW rules were disabled during the upgrade process.</li>
<li>Flow Exporter Alarms (Issues 3429787, 3456644): Fixed alarms and restored flow export functionality for Security Intelligence.</li>
</ul>
<h2 id="6-overlay-and-connectivity-improvements">6. Overlay and Connectivity Improvements</h2>
<ul>
<li>Overlay Segment Connectivity (Issue 3450019): Addressed connectivity loss in Overlay Segments when Edge TEP groups were enabled.</li>
</ul>
<h2 id="conclusion">Conclusion</h2>
<p>This NSX update resolves critical issues to improve operational reliability, security, and performance in virtual environments. For a seamless experience, upgrading to this release is highly recommended. As always, thorough testing in a staging environment before deployment in production is advised.</p>
]]></content>
		</item>
		
		<item>
			<title>iSCSI Tuning</title>
			<link>https://sdn-warrior.org/posts/iscsi-tuning/</link>
			<pubDate>Sun, 08 Dec 2024 12:21:20 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/iscsi-tuning/</guid>
			<description><![CDATA[Optimizing iSCSI Performance in an Unraid Environment]]></description>
			<content type="html"><![CDATA[<p>In my setup, I use iSCSI in combination with Unraid to create a DIY block storage solution. Unraid, with its flexibility, serves as the foundation, and I utilize the Linux iSCSI implementation installed via a plugin to enable block-level storage.</p>
<p>For my setup, I use an Intel NUC of the 13th generation, equipped with two 2.5G network adapters. These provide the necessary connectivity for storage traffic. I configured two VMkernel (VMK) adapters specifically for iSCSI traffic, ensuring redundancy and optimized throughput.</p>
<p>To further enhance performance, I’ve implemented several optimizations, including fine-tuning settings on my ESXi servers.</p>
<h2 id="optimize-maxiosizekb">Optimize MaxIoSizeKB</h2>
<p>One such optimization involves adjusting the maximum I/O size for iSCSI traffic.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">By default, VMware ESXi uses a <em><strong>MaxIoSizeKB</strong></em> value of 128 KB.</div>
    </aside>
<p>While this is sufficient for many setups, it may not be optimal for environments
like mine, where jumbo frames are enabled across the network. Larger packets perform better in such a configuration, reducing overhead and increasing throughput.</p>
<p>To take advantage of my network&rsquo;s capabilities, I increased the <em><strong>MaxIoSizeKB</strong></em> parameter to 512 KB.</p>
<p>To configure this, I ran the following command on my ESXi host:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli system settings advanced set -o /ISCSI/MaxIoSizeKB -i 512
</span></span></code></pre></div><p>This change allows the iSCSI initiator to send larger I/O requests, improving data transfer efficiency in my jumbo-frame-enabled network. With this configuration, I noticed a significant improvement in performance, as the network could handle larger blocks of data more effectively.</p>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">For this change to take effect, a host reboot is required. After restarting the ESXi server, the new value will be applied, enabling the iSCSI initiator to send larger I/O requests.</div>
    </aside>
<p>After the reboot, you can verify that the change has been successfully applied by running the following command:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli system settings advanced list -o /ISCSI/MaxIoSizeKB
</span></span></code></pre></div><p>The output should look like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[root@esxnuc1:~] esxcli system settings advanced list -o /ISCSI/MaxIoSizeKB
</span></span><span class="line"><span class="cl">   Path: /ISCSI/MaxIoSizeKB
</span></span><span class="line"><span class="cl">   Type: integer
</span></span><span class="line"><span class="cl">   Int Value: 512
</span></span><span class="line"><span class="cl">   Default Int Value: 128
</span></span><span class="line"><span class="cl">   Min Value: 128
</span></span><span class="line"><span class="cl">   Max Value: 512
</span></span><span class="line"><span class="cl">   String Value: 
</span></span><span class="line"><span class="cl">   Default String Value: 
</span></span><span class="line"><span class="cl">   Valid Characters: 
</span></span><span class="line"><span class="cl">   Description: Maximum Software iSCSI I/O size (in KB) (REQUIRES REBOOT!)
</span></span><span class="line"><span class="cl">   Host Specific: false
</span></span><span class="line"><span class="cl">   Impact: reboot
</span></span></code></pre></div><h2 id="optimize-multipathing">Optimize multipathing</h2>
<p>To optimize performance, I configured Round Robin as the multipathing policy for my iSCSI volumes on the ESXi server. This ensures better load distribution and failover capabilities. The configuration can be applied via the ESXi CLI as follows:</p>
<ul>
<li>List all connected storage devices to identify the target naa or eui identifier:</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli storage nmp device list
</span></span></code></pre></div><ul>
<li>Output</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">naa.60014058f1117188efe49cb8b5de2273
</span></span><span class="line"><span class="cl">   Device Display Name: LIO-ORG iSCSI Disk (naa.60014058f1117188efe49cb8b5de2273)
</span></span><span class="line"><span class="cl">   Storage Array Type: VMW_SATP_ALUA
</span></span><span class="line"><span class="cl">   Storage Array Type Device Config: {implicit_support=on; explicit_support=on; explicit_allow=on; alua_followover=on; action_OnRetryErrors=on; {TPG_id=0,TPG_state=AO}}
</span></span><span class="line"><span class="cl">   Path Selection Policy: VMW_PSP_MRU
</span></span><span class="line"><span class="cl">   Path Selection Policy Device Config: {policy=iops,iops=1000,bytes=10485760,useANO=0; lastPathIndex=0: NumIOsPending=0,numBytesPending=0}
</span></span><span class="line"><span class="cl">   Path Selection Policy Device Custom Config: policy=iops;iops=1000;bytes=10485760;samplingCycles=16;latencyEvalTime=180000;useANO=0;
</span></span><span class="line"><span class="cl">   Working Paths: vmhba64:C1:T0:L1, vmhba64:C0:T0:L1
</span></span><span class="line"><span class="cl">   Is USB: false
</span></span></code></pre></div>
    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">The default multipath policy is &ldquo;Most Recently Used&rdquo; (VMW_PSP_MRU).</div>
    </aside>
<ul>
<li>Set the multipathing policy for the desired iSCSI device to RoundRobin:</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli storage nmp device set --device &lt;DeviceIdentifier&gt; --psp VMW_PSP_RR
</span></span></code></pre></div><ul>
<li>Verify that the policy has been successfully applied:</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli storage nmp device list --device &lt;DeviceIdentifier&gt;
</span></span></code></pre></div>
    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">Replace <code>&lt;DeviceIdentifier&gt;</code> with the actual identifier of your iSCSI device (naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx).</div>
    </aside>
<ul>
<li>Output after changes</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">naa.60014058f1117188efe49cb8b5de2273
</span></span><span class="line"><span class="cl">   Device Display Name: LIO-ORG iSCSI Disk (naa.60014058f1117188efe49cb8b5de2273)
</span></span><span class="line"><span class="cl">   Storage Array Type: VMW_SATP_ALUA
</span></span><span class="line"><span class="cl">   Storage Array Type Device Config: {implicit_support=on; explicit_support=on; explicit_allow=on; alua_followover=on; action_OnRetryErrors=on; {TPG_id=0,TPG_state=AO}}
</span></span><span class="line"><span class="cl">   Path Selection Policy: VMW_PSP_RR
</span></span><span class="line"><span class="cl">   Path Selection Policy Device Config: {policy=iops,iops=1000,bytes=10485760,useANO=0; lastPathIndex=0: NumIOsPending=0,numBytesPending=0}
</span></span><span class="line"><span class="cl">   Path Selection Policy Device Custom Config: policy=iops;iops=1000;bytes=10485760;samplingCycles=16;latencyEvalTime=180000;useANO=0;
</span></span><span class="line"><span class="cl">   Working Paths: vmhba64:C1:T0:L1, vmhba64:C0:T0:L1
</span></span><span class="line"><span class="cl">   Is USB: false
</span></span></code></pre></div>
    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Info</b>
        </div>
        <div class="admonition-content">The Path Selection Policy must now be VMW_PSP_RR.</div>
    </aside>
<p>To further optimize path usage and load distribution, I adjusted the IOPS parameter for the Round Robin policy. By default, ESXi switches storage paths after 1000 I/O operations, but I used the following command snippet to change this behavior to switch after every single I/O:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">for i in `esxcfg-scsidevs -c |awk &#39;{print $1}&#39; | grep naa.xxxx`; do 
</span></span><span class="line"><span class="cl">   esxcli storage nmp psp roundrobin deviceconfig set --type=iops --iops=1 --device=$i
</span></span><span class="line"><span class="cl">done
</span></span></code></pre></div><p>Here, <code>naa.xxxx</code> stands for the first few characters of your NAA IDs, so the loop only touches the matching devices. Reducing the IOPS value from 1000 to 1 means the ESXi host alternates between the available paths after every I/O. In practice, this can help distribute the workload evenly across all paths, potentially improving overall responsiveness and performance.</p>
<p>However, when combined with changes like increasing MaxIoSizeKB, the outcome can vary. In some cases, this adjustment may yield better results, while in others it could degrade performance. Therefore, it’s crucial to test these parameters individually for each storage environment to determine the most effective configuration.</p>
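<p>Because these settings should be tested per environment, it can help to review the exact commands before running them. The following is a minimal sketch of a dry-run helper: <code>rr_commands</code> is a hypothetical name, and the function only prints the two esxcli calls (Round Robin policy and IOPS limit) for every device ID that starts with a given prefix.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"># Hypothetical dry-run helper: reads device lines on stdin and prints,
# but does not execute, the esxcli commands for matching devices.
# Arguments: NAA prefix to match, IOPS value for the Round Robin policy.
rr_commands() {
  prefix="$1"
  iops="$2"
  while read -r dev _rest; do
    case "$dev" in
      "$prefix"*)
        echo "esxcli storage nmp device set --device=$dev --psp=VMW_PSP_RR"
        echo "esxcli storage nmp psp roundrobin deviceconfig set --type=iops --iops=$iops --device=$dev"
        ;;
    esac
  done
}

# Example input: the device ID from the output above.
# On a host, the input could come from: esxcfg-scsidevs -c
echo "naa.60014058f1117188efe49cb8b5de2273 Direct-Access" | rr_commands naa.6001405 1
</code></pre></div>
<p>Once the printed commands look right, they can be pasted into the ESXi shell one device at a time, which makes it easier to measure the effect of each change separately.</p>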
<h2 id="why-jumbo-frames-matter-bonus">Why Jumbo Frames Matter (bonus)</h2>
<p>Jumbo frames allow Ethernet frames larger than the standard 1500 bytes to be transmitted, reducing the total number of frames required to send the same amount of data. This results in lower CPU overhead and better performance, particularly in high-bandwidth and storage-intensive environments. However, it’s essential to ensure that every device in the network path—NICs, switches, and storage systems—supports and is configured for jumbo frames for optimal performance.</p>
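<p>To get a feeling for the reduction in frame count, here is a rough back-of-the-envelope sketch. It assumes 40 bytes of IPv4 and TCP headers per frame and ignores iSCSI headers and TCP options, so the numbers are approximations; <code>frames_needed</code> is a hypothetical helper name.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"># Approximate number of frames needed to carry one I/O of a given size.
# Assumption: 40 bytes per frame are used by IPv4 + TCP headers;
# iSCSI headers and TCP options are ignored for simplicity.
frames_needed() {
  io_bytes="$1"
  mtu="$2"
  payload=$((mtu - 40))
  # Integer ceiling division
  echo $(( (io_bytes + payload - 1) / payload ))
}

io=$((512 * 1024))   # one 512 KB I/O
echo "MTU 1500: $(frames_needed "$io" 1500) frames"
echo "MTU 9000: $(frames_needed "$io" 9000) frames"
</code></pre></div>
<p>Under these assumptions, a single 512 KB I/O needs 360 frames at an MTU of 1500 but only 59 frames at an MTU of 9000.</p>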
<p>To verify that jumbo frames are functioning correctly in your environment, you can use the <em><strong>vmkping</strong></em> command on your ESXi host:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">vmkping -I vmk1 -s 8972 -d 192.168.67.250
</span></span></code></pre></div>
    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b> vmkping parameters</b>
        </div>
        <div class="admonition-content"><ul>
<li>Replace <em><strong>vmk1</strong></em> with the VMkernel adapter used for iSCSI.</li>
<li><em><strong>-s 8972</strong></em> sets the ICMP payload size to 8972 bytes. Together with 28 bytes of IP and ICMP headers, this results in a 9000-byte packet that exactly fills a jumbo frame MTU.</li>
<li><em><strong>-d</strong></em> enables the Don&rsquo;t Fragment flag to ensure the packet isn&rsquo;t fragmented along the way.</li>
</ul>
</div>
    </aside>
<p>If you see the error <em><strong>sendto() failed (Message too long)</strong></em>, the packet is too large to be transmitted without fragmentation. For a setup with an MTU of 9000 configured on the distributed or standard switch, a packet size of 8972 bytes should work correctly. If the error occurs, check your vSwitch and VMkernel adapter settings, your physical switches, and your iSCSI target.</p>
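<p>The packet size for the test follows directly from the MTU. As a small sketch, assuming IPv4 without options (20-byte IP header plus 8-byte ICMP header), the value can be calculated like this; <code>icmp_payload</code> is a hypothetical helper name, and the interface and target address are the ones from my example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"># Largest ICMP payload that fits into one packet without fragmentation.
# Assumption: IPv4 without options (20 bytes) plus ICMP header (8 bytes).
icmp_payload() {
  echo $(( $1 - 20 - 8 ))
}

mtu=9000
# Print the matching vmkping command instead of running it.
echo "vmkping -I vmk1 -s $(icmp_payload "$mtu") -d 192.168.67.250"
</code></pre></div>
<p>For an MTU of 9000 this prints the command with <em><strong>-s 8972</strong></em>. The same calculation gives 1472 for a standard MTU of 1500.</p>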
<h2 id="conclusion">Conclusion</h2>
<p>By increasing the MaxIoSizeKB to 512 KB, verifying jumbo frame functionality with vmkping, and enabling Round Robin, I optimized my iSCSI setup to leverage the full potential of my 2.5G network.</p>
<figure><picture><source srcset="/iscsi-tuning/test_hu_a09ae457807ef1f4.png" type="image/png">
          <img
            src="/iscsi-tuning/test_hu_a09ae457807ef1f4.png" alt="CrystalDiskMark Benchmark" width="479"
            height="351"/>
        </picture><figcaption><p>CrystalDiskMark Benchmark</p></figcaption></figure>
<p>In my tests with CrystalDiskMark, I observed that both network adapters showed significant performance improvements as a result of these optimizations. These adjustments allow my Unraid and iSCSI configuration to deliver a robust and high-performance block storage solution tailored to my workloads.</p>
<h2 id="disclaimer">Disclaimer</h2>
<p>The settings and configurations described in this article are specific to my environment and were tested extensively within my setup. While these adjustments significantly improved performance for my use case, they may not be universally applicable. It’s essential to test these settings in your environment before implementing them, as results may vary depending on hardware, network, and workload specifics. These optimizations are not intended as a blanket recommendation.</p>
]]></content>
		</item>
		
		<item>
			<title>MAC Learning is your friend</title>
			<link>https://sdn-warrior.org/posts/mac-learning/</link>
			<pubDate>Wed, 27 Nov 2024 19:54:18 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/mac-learning/</guid>
			<description><![CDATA[Why you should use MAC Learning]]></description>
			<content type="html"><![CDATA[<h2 id="intro">Intro</h2>
<p>When working with nested ESXi environments, understanding the interplay between MAC Learning, Promiscuous Mode, and Forged Transmits is critical. These settings significantly affect how traffic flows in virtualized networks, especially in scenarios involving virtualized hypervisors or advanced network configurations.</p>
<ul>
<li>
<p>MAC Learning: Think of it as switch-like behavior in your virtual environment. It optimizes network traffic by ensuring that each virtual machine (VM) receives only the packets meant for its MAC address.
Without MAC Learning, the switch port that a nested ESXi VM&rsquo;s vNIC connects to only knows that vNIC&rsquo;s single static MAC address, not the MAC addresses of the VMs and VMkernel adapters behind it.
<a href="https://docs.vmware.com/en/VMware-vSphere/8.0/vsphere-networking/GUID-E0246B3D-9FB1-4976-8217-5C085863EA9A.html">(more Information about MAC learning)</a></p>
</li>
<li>
<p>Promiscuous Mode: On the other hand, this allows a VM or virtual switch to capture all traffic on a port group, whether addressed to it or not. It&rsquo;s a useful feature for troubleshooting and monitoring but comes with potential security and performance implications.</p>
</li>
<li>
<p>Forged Transmits: Forged Transmits plays a complementary role in this configuration. It ensures that traffic originating from a VM with a source MAC address different from its assigned MAC address is allowed to leave the virtual switch. This is crucial in nested environments.</p>
</li>
</ul>
<h2 id="lab-environment">Lab environment</h2>
<p>In this lab environment, I am using two Minisforum MS-01 workstations, each equipped with ESXi 8.0.3 as the hypervisor. These compact systems provide a balance of performance and energy efficiency, fitting perfectly into my goal of maintaining a powerful yet quiet setup.</p>
<p>Each workstation is interconnected via dual 10 Gb/s network links, ensuring high-speed communication with minimal latency. This setup is particularly advantageous for simulating complex network scenarios and nested virtualization environments.</p>
<p>On each workstation, a nested ESXi host is deployed. These nested hosts act as virtualized hypervisors for a future VCF deployment.</p>
<h2 id="the-problems-ive-caused">The problems I&rsquo;ve caused</h2>
<p>In my previous lab setups, Promiscuous Mode was my go-to solution for nested virtualization. It was reliable, simple to configure, and worked flawlessly for years. While I was aware of the security risks associated with it, in a controlled homelab environment, those risks were not a significant concern.</p>
<p>However, everything changed when I upgraded my lab to dual 10 Gb/s network links and, powered by the i9 CPU, gained the ability to run multiple nested ESXi hosts on a single physical machine.
One of the first challenges I encountered was during the configuration of a vSAN port group for my nested ESXi hosts. This port group was configured to use Active/Active load balancing across both 10 Gb/s uplinks on the MS-01 workstations.
Almost immediately, I noticed unexpected performance issues. Nested VMs were experiencing slow network speeds, and vSAN operations were significantly hindered. Initially, I struggled to pinpoint the root cause. Given my past success with Promiscuous Mode, I didn’t suspect it could be contributing to the problem.</p>
<p><a href="https://cybernils.net/2024/11/26/the-effect-of-using-mac-learning-in-esxi-nested-labs/">This article</a> by my fellow vExpert colleague Nils Kristiansen inspired me to delve deeper into the topic.</p>
<h2 id="why-promiscuous-mode-became-a-problem">Why Promiscuous Mode Became a Problem</h2>
<p>The performance degradation stemmed from how traffic was handled with Promiscuous Mode in a dual-uplink, Active/Active configuration:</p>
<ul>
<li>
<p>Broadcasting Traffic Across Both Uplinks: Promiscuous Mode caused the virtual switch to deliver all traffic to every uplink, regardless of the destination. With two high-speed uplinks in an Active/Active configuration, this created excessive overhead, saturating the uplinks and causing packet drops.</p>
</li>
<li>
<p>vSAN’s High Sensitivity to Latency: vSAN traffic is highly dependent on low latency and consistent performance. The unnecessary broadcast of packets interfered with its ability to operate efficiently.</p>
</li>
<li>
<p>Nested Virtualization Amplified the Problem: Nested ESXi hosts added another layer of complexity. The inner VMs were sending and receiving traffic that the parent ESXi host’s virtual switch struggled to handle efficiently under Promiscuous Mode.</p>
</li>
</ul>
<h2 id="ok-but-how-bad-is-the-performance">OK, but how bad is the performance?</h2>
<p>To quantify the performance issues, I turned to iPerf3, a reliable tool for measuring network throughput that is conveniently included in ESXi 8. Using iPerf3, I conducted a series of tests to better understand the extent of the performance degradation.</p>
<h3 id="performance-measurement-1-both-physical-nics-active-nested-hosts-on-the-same-physical-host">Performance Measurement 1: Both Physical NICs Active, Nested Hosts on the Same Physical Host</h3>
<p>For the first test, I configured both pNICs (10 Gb/s) as active in an Active/Active load balancing setup and placed both nested ESXi hosts on the same physical host. Additionally, Promiscuous Mode was enabled on the port group to ensure traffic could flow properly between the nested hosts.</p>
<p><img src="test1.png" alt="Test 1"></p>
<h3 id="results">Results</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[ ID] Interval           Transfer     Bitrate         Retr
</span></span><span class="line"><span class="cl">[  5]   0.00-10.00  sec  1.35 GBytes  1.16 Gbits/sec    0            sender  
</span></span><span class="line"><span class="cl">[  5]   0.00-10.00  sec  1.35 GBytes  1.16 Gbits/sec                 receiver 
</span></span></code></pre></div><h3 id="performance-measurement-2-single-nic-active-nested-hosts-on-the-same-physical-host">Performance Measurement 2: Single NIC Active, Nested Hosts on the Same Physical Host</h3>
<p>For the second test, I modified the setup to use only one physical NIC (pNIC) while keeping both nested ESXi hosts on the same physical host. Promiscuous Mode was still enabled on the port group to ensure traffic routing between the nested hosts. By disabling the second uplink, the traffic path was simplified, reducing potential conflicts.</p>
<p><img src="test2.png" alt="Test 2"></p>
<h3 id="results-1">Results</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[ ID] Interval           Transfer     Bitrate         Retr
</span></span><span class="line"><span class="cl">[  5]   0.00-10.00  sec  11.4 GBytes  9.82 Gbits/sec    0            sender
</span></span><span class="line"><span class="cl">[  5]   0.00-10.01  sec  11.4 GBytes  9.80 Gbits/sec                 receiver
</span></span></code></pre></div><h3 id="performance-measurement-3-mac-learning-and-forged-transmits-dual-uplinks-nested-hosts-on-the-same-physical-host">Performance Measurement 3: MAC Learning and Forged Transmits, Dual Uplinks, Nested Hosts on the Same Physical Host</h3>
<p>For the third test, I switched to using MAC Learning and Forged Transmits, while keeping both physical NICs (pNICs) active in the Active/Active load balancing configuration. Both nested ESXi hosts were still located on the same physical host. This configuration was designed to optimize traffic handling without relying on Promiscuous Mode.</p>
<p><img src="test3.png" alt="Test 3"></p>
<h3 id="results-2">Results</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[ ID] Interval           Transfer     Bitrate         Retr  
</span></span><span class="line"><span class="cl">[  5]   0.00-10.00  sec  24.2 GBytes  20.8 Gbits/sec    0            sender  
</span></span><span class="line"><span class="cl">[  5]   0.00-10.01  sec  24.2 GBytes  20.8 Gbits/sec                 receiver  
</span></span></code></pre></div><h3 id="performance-measurement-4-mac-learning-and-forged-transmits-single-uplink-nested-hosts-on-the-same-physical-host">Performance Measurement 4: MAC Learning and Forged Transmits, Single Uplink, Nested Hosts on the Same Physical Host</h3>
<p>For the fourth test, I used a single uplink (pNIC) with both nested ESXi hosts on the same ESXi server. MAC Learning and Forged Transmits were enabled to optimize traffic handling.
The throughput was 20.7 Gbits/sec, almost identical to Test 3. This confirms that, since the traffic did not need to traverse the physical network infrastructure, the single uplink configuration with MAC Learning and Forged Transmits performed just as efficiently, without the overhead of Promiscuous Mode.</p>
<p><img src="test4.png" alt="Test 4"></p>
<h3 id="results-3">Results</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[ ID] Interval           Transfer     Bitrate         Retr  
</span></span><span class="line"><span class="cl">[  5]   0.00-10.00  sec  24.2 GBytes  20.8 Gbits/sec    0            sender  
</span></span><span class="line"><span class="cl">[  5]   0.00-10.01  sec  24.2 GBytes  20.8 Gbits/sec                 receiver  
</span></span></code></pre></div><h3 id="further-performance-measurements-and-security-considerations">Further Performance Measurements and Security Considerations</h3>
<p>Additional performance tests revealed that the difference between Promiscuous Mode and MAC Learning was minimal or even non-existent when the nested hosts were placed on two different physical hosts.
The traffic between the nested VMs did not significantly differ whether Promiscuous Mode or MAC Learning was enabled, indicating that both configurations performed similarly in a multi-host environment.</p>
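<p>To put the same-host measurements from Tests 1 to 3 in proportion, the bitrates can be compared directly. This is just a small sketch that uses the sender bitrates from the iPerf3 results above; <code>speedup</code> is a hypothetical helper name.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"># Ratio of two iPerf3 bitrates (in Gbits/sec), rounded to one decimal place.
speedup() {
  awk -v new="$1" -v old="$2" 'BEGIN { printf "%.1f\n", new / old }'
}

# Sender bitrates taken from the results above.
echo "Test 2 vs. Test 1: $(speedup 9.82 1.16)x"
echo "Test 3 vs. Test 1: $(speedup 20.8 1.16)x"
</code></pre></div>
<p>With both uplinks active, switching from Promiscuous Mode to MAC Learning and Forged Transmits therefore increased the measured throughput by a factor of roughly 18 in my lab.</p>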

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>It&#39;s important to note the security implications of using Promiscuous Mode</b>
        </div>
        <div class="admonition-content">Enabling Promiscuous Mode on a network interface allows all traffic to be sent to the VM, even traffic not intended for it, which can expose sensitive data or potentially allow malicious activity.
Because of this security concern, Promiscuous Mode should only be enabled temporarily in production environments, and only for troubleshooting purposes.
It is recommended to disable it once the issue is resolved to maintain a secure network setup.</div>
    </aside>
<h3 id="side-effect-of-promiscuous-mode-duplicate-packets">Side Effect of Promiscuous Mode: Duplicate Packets</h3>
<p>Enabling Promiscuous Mode on a network interface can lead to duplicate packets when both the source and destination are on the same ESXi host. In this mode, the virtual machine receives all traffic, including its own outbound packets, causing unnecessary duplication. This can result in performance degradation due to increased CPU usage and network inefficiencies.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[root@esxnuc04:/usr/lib/vmware/vsan/bin] vmkping -I vmk1 192.168.69.203
</span></span><span class="line"><span class="cl">PING 192.168.69.203 (192.168.69.203): 56 data bytes
</span></span><span class="line"><span class="cl">64 bytes from 192.168.69.203: icmp_seq=0 ttl=64 time=0.356 ms
</span></span><span class="line"><span class="cl">64 bytes from 192.168.69.203: icmp_seq=0 ttl=64 time=0.423 ms (DUP!)
</span></span><span class="line"><span class="cl">64 bytes from 192.168.69.203: icmp_seq=0 ttl=64 time=0.426 ms (DUP!)
</span></span><span class="line"><span class="cl">64 bytes from 192.168.69.203: icmp_seq=0 ttl=64 time=0.429 ms (DUP!)
</span></span><span class="line"><span class="cl">64 bytes from 192.168.69.203: icmp_seq=1 ttl=64 time=0.249 ms
</span></span><span class="line"><span class="cl">64 bytes from 192.168.69.203: icmp_seq=1 ttl=64 time=0.274 ms (DUP!)
</span></span><span class="line"><span class="cl">64 bytes from 192.168.69.203: icmp_seq=1 ttl=64 time=0.277 ms (DUP!)
</span></span><span class="line"><span class="cl">64 bytes from 192.168.69.203: icmp_seq=1 ttl=64 time=0.281 ms (DUP!)
</span></span><span class="line"><span class="cl">64 bytes from 192.168.69.203: icmp_seq=2 ttl=64 time=0.261 ms
</span></span></code></pre></div><h2 id="sidequest-using-iperf-on-esxi-803">Sidequest: Using iPerf on ESXi 8.0.3</h2>
<p>To use iPerf for network performance testing on ESXi 8.0.3, you&rsquo;ll need to follow a few steps to enable and configure the necessary settings.</p>
<ul>
<li>Step 1: Disable the ESXi firewall temporarily
First, disable the ESXi firewall to allow the iPerf tool to operate without restrictions:</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli network firewall set --enabled false
</span></span></code></pre></div><ul>
<li>Step 2: Allow executing iPerf
Next, set the system to allow execution of non-installed binaries (such as iPerf), which are not part of the default ESXi installation:</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">localcli system settings advanced set -o /User/execInstalledOnly -i 0
</span></span></code></pre></div><ul>
<li>Step 3: Execute iPerf (Client example)
Once you&rsquo;ve set the necessary configuration, you can execute iPerf to test the network performance. Use the following command to run iPerf as a client (-c) and specify the target IP address (e.g., 192.168.69.203):</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">/usr/lib/vmware/vsan/bin/iperf3 -c 192.168.69.203
</span></span></code></pre></div><ul>
<li>Step 4: Re-enable the firewall
Once you’ve finished testing, remember to re-enable the firewall for security reasons:</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli network firewall set --enabled true
</span></span></code></pre></div><ul>
<li>Step 5: Restrict execution of non-installed binaries
To revert the system to its default behavior and restrict the execution of non-installed binaries, run the following command:</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">localcli system settings advanced set -o /User/execInstalledOnly -i 1
</span></span></code></pre></div><h2 id="why-mac-learning-and-forged-transmits-replace-promiscuous-mode-in-nested-environments">Why MAC Learning and Forged Transmits Replace Promiscuous Mode in Nested Environments</h2>
<p>In a typical nested ESXi environment, each inner VM sends packets with its unique MAC address, which the virtual switch on the parent ESXi host does not recognize by default. This creates a challenge because:</p>
<ul>
<li>Without Promiscuous Mode (or MAC Learning), the switch does not deliver frames destined for these MAC addresses to the nested host&rsquo;s vNIC.</li>
<li>Without Forged Transmits, packets from inner VMs with &ldquo;forged&rdquo; source MAC addresses are also dropped.</li>
</ul>
<h3 id="by-enabling-mac-learning-and-forged-transmits">By enabling MAC Learning and Forged Transmits:</h3>
<p><em><strong>MAC Learning</strong></em> ensures that the virtual switch learns the inner VMs’ MAC addresses dynamically, so it can correctly forward traffic to them without requiring Promiscuous Mode.
<em><strong>Forged Transmits</strong></em> ensures that traffic from inner VMs with different source MAC addresses is allowed to leave the parent VM&rsquo;s vNIC.</p>
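<p>To make the difference more tangible, here is a minimal Python sketch of a toy switch. It is only an illustration under my own assumptions: the class <code>ToySwitch</code>, the port names and the MAC labels are hypothetical, and the real ESXi switch is far more involved. With learning enabled, the toy switch remembers behind which port a source MAC lives and delivers later frames to that port only; without it, it falls back to promiscuous-style behaviour and copies every frame to all other ports.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">class ToySwitch:
    """Toy model of a switch with optional MAC learning (illustration only)."""

    def __init__(self, ports, mac_learning):
        self.ports = list(ports)
        self.mac_learning = mac_learning
        self.mac_table = {}  # learned MAC address, mapped to its port

    def deliver(self, in_port, src_mac, dst_mac):
        """Return the list of ports that receive the frame."""
        if self.mac_learning:
            # Remember behind which port the source MAC lives.
            self.mac_table[src_mac] = in_port
            if dst_mac in self.mac_table:
                # Known destination: deliver to exactly one port.
                return [self.mac_table[dst_mac]]
        # Promiscuous-style behaviour (or unknown destination):
        # every other port sees the frame.
        return [p for p in self.ports if p != in_port]


ports = ["nested-esx1", "nested-esx2", "nested-esx3"]
learning = ToySwitch(ports, mac_learning=True)
promisc = ToySwitch(ports, mac_learning=False)

# An inner VM behind nested-esx2 sends one frame, so its MAC gets learned.
learning.deliver("nested-esx2", "vm-b", "vm-a")

# Traffic for vm-b: one receiving port with learning, all others without.
print(learning.deliver("nested-esx1", "vm-a", "vm-b"))
print(promisc.deliver("nested-esx1", "vm-a", "vm-b"))
</code></pre></div>
<p>In this toy model, the frame for <code>vm-b</code> reaches only <code>nested-esx2</code> once its MAC has been learned, whereas the promiscuous-style switch hands it to both other ports.</p>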

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Conclusion</b>
        </div>
        <div class="admonition-content"><p>The combination of MAC Learning and Forged Transmits removes the need for Promiscuous Mode while providing:</p>
<ul>
<li>Better performance, as traffic is only sent where it is needed.</li>
<li>Stronger security, since traffic is not broadcast unnecessarily.</li>
</ul></div>
    </aside>
<p>MAC Learning with Forged Transmits is a significant performance game changer, especially when running multiple nested VMs on a single physical ESXi host.
However, it&rsquo;s important to note that MAC Learning with Forged Transmits requires a Distributed Switch. If you&rsquo;re using a Standard Switch, you&rsquo;ll still need to rely on Promiscuous Mode to achieve similar functionality.</p>
]]></content>
		</item>
		
		<item>
			<title>Unraid - A Storage Journey</title>
			<link>https://sdn-warrior.org/posts/unraid-storage/</link>
			<pubDate>Tue, 19 Nov 2024 23:00:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/unraid-storage/</guid>
			<description><![CDATA[How I use Unraid]]></description>
			<content type="html"><![CDATA[<h2 id="my-custom-unraid-storage-build---flexibility-simplicity-and-future-proofing">My Custom Unraid Storage Build - Flexibility, Simplicity, and Future-Proofing</h2>
<p>As a passionate homelabber, I enjoy not only using technology but also understanding and customizing it to suit my needs. My Unraid storage system is one of my longest-running projects, continuously evolving to meet the demands of my homelab.</p>
<p>After thorough research, I decided to go with Unraid – an operating system renowned for its simplicity, flexibility, and scalability. These key features were the deciding factors for me:</p>
<ul>
<li>Easy Expansion: Unraid allows combining drives of different sizes and expanding the array later without replacing all disks at once.</li>
<li>Docker Integration: The ability to run Docker containers directly on Unraid unlocks immense potential for personal projects and applications.</li>
<li>Versatility: Whether it’s managing data, running a media server, or hosting virtual machines, Unraid offers the freedom to adapt the system to your needs.</li>
</ul>
<p>In this blog post, I’ll share my experience and guide you through how I’ve planned, built, and continuously improved my Unraid storage system. It’s a perfect solution for anyone seeking a scalable, cost-effective setup without sacrificing performance or ease of use.</p>
<h2 id="the-beginning-my-first-steps-with-unraid">The Beginning: My First Steps with Unraid</h2>
<p>Unraid is a commercial product that initially caught my attention due to its unique approach to storage management. Historically, Unraid licenses were available for a one-time purchase, providing lifetime access to its features. Today, however, users can choose between a subscription model or a lifetime license, offering flexibility depending on individual needs.</p>
<p>One of the standout features of Unraid is that it boots directly from a USB stick. This design choice not only simplifies installation but also makes it incredibly easy to replace hardware. Simply move the USB stick to a new machine, and the system is ready to run without the need for extensive reconfiguration.</p>
<p>My first Unraid “server” was far from conventional: a Lenovo notebook powered by an old Intel Dual-Core processor. To build my initial array, I used external USB disks – a true makeshift setup but perfect for testing the waters. For three weeks, this setup served as my proof of concept (POC), allowing me to explore Unraid’s capabilities and ensure it met my needs before committing to more suitable hardware.</p>
<p>This early experimentation confirmed that Unraid was the right choice for my homelab, and I soon began planning and acquiring the components for my first proper build.</p>
<h2 id="building-a-3-tier-performance-storage-system">Building a 3-Tier Performance Storage System</h2>
<p>As my Unraid setup evolved, I implemented a 3-tier performance storage system to meet the varying demands of my homelab. Each tier is tailored for a specific purpose, optimizing the balance between speed, capacity, and efficiency:</p>
<ul>
<li>Tier 1: The Unraid Array (Slow Storage)
At the foundation of my storage system is the Unraid Array, which serves as the slowest but most capacious tier. Unlike traditional RAID, an Unraid Array does not stripe data across disks. Instead, each disk holds individual files, while parity disks provide fault tolerance for data recovery. This unique design allows mixing drives of different sizes, making upgrades straightforward and cost-effective.
My Unraid Array is hosted in an external USB 3.2 storage shelf, which presents each drive individually to Unraid. The shelf delivers enough bandwidth to operate all six 6TB enterprise SATA drives at full speed, ensuring reliable performance even during intensive data access.</li>
</ul>

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>A Quick Warning About Using Unraid with USB Disk Shelves</b>
        </div>
        <div class="admonition-content"><p>It’s important to note that Unraid does not officially recommend running the array on a USB disk shelf, as USB connections can sometimes lead to instability or performance issues. However, my personal experience has shown that it can work reliably with the right hardware.</p>
<p>In my setup, I use a TerraMaster D6-320 USB 3.2 disk shelf, paired with a high-quality USB controller like those found in devices such as Intel NUCs. This combination has proven stable and capable of delivering full-speed performance for all six enterprise SATA drives in my array.
While this setup works well for me, I recommend testing thoroughly in your environment to ensure stability and compatibility before committing to a similar configuration.</p>
</div>
    </aside>
<ul>
<li>
<p>Tier 2: Consumer NVMe Drives (Fast Cache and Docker Storage)
The second performance tier consists of consumer-grade NVMe drives, configured in a btrfs pool within Unraid. This configuration not only allows advanced features like snapshots but also supports RAID levels within the pool, providing a balance between performance and redundancy.
This tier is designed to handle tasks requiring high-speed I/O, such as hosting Docker containers. Keeping Docker data on the NVMe tier ensures that the Unraid Array doesn’t need to spin up unnecessarily, prolonging the life of the disks and improving system responsiveness.
The NVMe drives also serve as a fast cache for incoming data. Files uploaded to the NAS during the day are stored on the NVMe tier and then moved to the slower Unraid Array overnight—except for Docker data, which always remains on the NVMe storage to maintain optimal performance.
Unraid’s flexibility makes it easy to decide whether specific shares or data should stay on the NVMe pool or be automatically moved to the Array on a scheduled basis. This level of control ensures you can optimize storage placement to suit your workload, balancing speed and capacity seamlessly.</p>
</li>
<li>
<p>Tier 3: Enterprise NVMe via iSCSI (Fast and Durable Storage)
The top tier features a 4TB enterprise NVMe drive, designed for high-speed and durable performance under constant load. This storage tier is shared with my homelab servers via iSCSI Multichannel, utilizing a 2x10Gb Intel X710 NIC for redundancy and maximum throughput.
This tier provides fast, reliable storage for workloads that demand consistent performance, such as virtual machines or other critical applications in my homelab. By leveraging enterprise-grade hardware and robust networking, this storage layer ensures low-latency access and can handle the demands of continuous use without compromising reliability.</p>
</li>
</ul>
<h2 id="current-setup">Current Setup</h2>
<p>My Unraid server is built on an Intel NUC Extreme 11th Gen i7 with 64GB of RAM, offering a powerful and compact platform for my homelab. The storage setup includes 2x 1TB consumer-grade NVMe drives for fast cache and a 4TB enterprise-grade NVMe drive for ultra-reliable, high-performance storage.</p>
<p>The Unraid Array has a total capacity of 42TB, with 33.4TB usable for data storage. This provides ample space for both my active projects and long-term storage needs.</p>
<p>On the software side, I host 29 Docker container services and 4 virtual machines, including critical services such as my Active Directory (AD), Certificate Authority (CA), and a Veeam Proxy for file backups. This setup allows for a highly efficient and flexible environment that supports a wide range of use cases while maintaining reliable performance.</p>
<p><img src="unraid2.jpg" alt="Gui"></p>
<h2 id="performance">Performance</h2>
<p>The performance of my Unraid setup was measured using CrystalDiskMark with a 16GB test file (on a Windows 11 VM) to evaluate both sequential and random read and write speeds, as well as IOPS (Input/Output Operations Per Second) of my iSCSI Storage (Tier 3). The results highlight the impressive capabilities of the system:</p>
<p>Read Performance:</p>
<ul>
<li>Sequential Read (Q8T1): 1.993 GB/s | IOPS: 1,900.35</li>
<li>Sequential Read (Q1T1): 0.782 GB/s | IOPS: 746.21</li>
<li>Random Read 4K (Q32T1): 0.322 GB/s | IOPS: 78,651.61</li>
<li>Random Read 4K (Q1T1): 0.021 GB/s | IOPS: 5,149.17</li>
</ul>
<p>Write Performance:</p>
<ul>
<li>Sequential Write (Q8T1): 1.247 GB/s | IOPS: 1,189.37</li>
<li>Sequential Write (Q1T1): 0.802 GB/s | IOPS: 764.48</li>
<li>Random Write 4K (Q32T1): 0.298 GB/s | IOPS: 72,776.61</li>
<li>Random Write 4K (Q1T1): 0.036 GB/s | IOPS: 8,835.45</li>
</ul>
<p>These performance metrics demonstrate both the high throughput and responsiveness of the NVMe storage.
The sequential read and write speeds are excellent for large file transfers, while the random IOPS (especially at Q32T1) indicate the drive’s ability to handle a high volume of small, random data requests.
Despite the lower random read/write speeds at Q1T1, the system still shows strong overall performance for a variety of tasks.</p>
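<p>The throughput and IOPS columns are two views of the same measurement: throughput is roughly IOPS times block size. Here is a small Python sketch to cross-check some of the numbers above. It assumes the CrystalDiskMark defaults of 1 MiB blocks for the sequential tests and 4 KiB blocks for the random tests (the block sizes are my assumption, and the helper name is hypothetical):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">MIB = 1024 ** 2  # assumed block size of the sequential tests (1 MiB)
KIB = 1024       # the random tests use 4 KiB blocks


def throughput_gb_s(iops, block_bytes):
    """Throughput in decimal GB/s derived from IOPS and block size."""
    return iops * block_bytes / 1e9


# IOPS values taken from the result lists above.
results = {
    "Sequential Read (Q8T1)": (1900.35, MIB),
    "Random Read 4K (Q32T1)": (78651.61, 4 * KIB),
    "Sequential Write (Q8T1)": (1189.37, MIB),
    "Random Write 4K (Q32T1)": (72776.61, 4 * KIB),
}

for name, (iops, block) in results.items():
    print(f"{name}: {throughput_gb_s(iops, block):.3f} GB/s")
</code></pre></div>
<p>Under these assumptions, the computed values land on the reported 1.993, 0.322, 1.247 and 0.298 GB/s.</p>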
<h2 id="understanding-the-crystaldiskmark-test-parameters">Understanding the CrystalDiskMark Test Parameters</h2>
<p>In CrystalDiskMark, several key parameters define how the storage device is tested. Here’s a breakdown of what each test represents:</p>
<p>Q8T1: This stands for Queue Depth 8, Thread 1. It simulates a scenario where 8 data requests are queued up, but only 1 thread (or process) is handling those requests. This type of test is useful for measuring the performance of the storage device when handling multiple sequential tasks at once, but not with excessive parallelism.</p>
<p>Q1T1: This stands for Queue Depth 1, Thread 1. Here, only 1 data request is in the queue, and a single thread handles it. This test represents the performance when a single request is being processed at a time, simulating typical user scenarios where one operation is occurring without significant multitasking.</p>
<p>Q32T1: This stands for Queue Depth 32, Thread 1. In this case, there are 32 queued data requests with a single thread handling them. This test simulates heavy workloads where many data requests are pending, but only one thread is processing them. It can show how the device handles stress under larger, more sustained read operations.</p>
<h3 id="sequential-vs-random-read-tests">Sequential vs. Random Read Tests</h3>
<p>Sequential Read: This test measures how fast the storage device can read large, contiguous chunks of data, like streaming a video or transferring large files. It simulates real-world scenarios where large files need to be read from the storage at a steady rate.</p>
<p>Sequential Read (Q8T1): 1.993 GB/s – This high performance indicates the drive can handle multiple large file read operations quickly, with 8 data requests queued up.
Sequential Read (Q1T1): 0.782 GB/s – This is slower than the Q8T1 test because only one request is processed at a time, simulating less intensive operations.</p>
<p>Random Read: This test measures the performance when the drive has to read small, non-contiguous chunks of data from different parts of the storage. This type of test is more representative of workloads like database operations or running small applications that frequently access different data blocks.</p>
<p>Random Read 4K (Q32T1): 0.322 GB/s – With 32 queued requests and 4KB blocks, this performance shows how the system handles multiple random reads.
Random Read 4K (Q1T1): 0.021 GB/s – Here, only one small request is in flight at a time, so throughput is bound by the latency of each individual request (including the iSCSI round trip) rather than by the bandwidth of the drive.</p>
<p>These tests give a complete picture of the drive&rsquo;s performance under different scenarios: from high-speed, sequential reads (large files) to more intensive, random access (small files), and with varying levels of workload concurrency.</p>
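<p>A rough way to relate queue depth to latency is Little&rsquo;s law: average latency is approximately queue depth divided by IOPS. The sketch below applies it to the two random read results. The helper name is hypothetical, and the result is an estimate, not a measured latency.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">def avg_latency_us(queue_depth, iops):
    """Estimated average per-request latency in microseconds (Little's law)."""
    return queue_depth / iops * 1_000_000


# IOPS values taken from the read results above.
print(round(avg_latency_us(1, 5149.17)))    # Random Read 4K (Q1T1)
print(round(avg_latency_us(32, 78651.61)))  # Random Read 4K (Q32T1)
</code></pre></div>
<p>This works out to roughly 194 microseconds per request at Q1T1 and roughly 407 microseconds at Q32T1: with more requests in flight, each one waits longer, but the total number of requests completed per second is far higher.</p>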
<h2 id="bom-bill-of-materials">BOM (Bill of Materials)</h2>
<ul>
<li>NUC11DBBi7, Version M17027-404</li>
<li>2x 32 GB RAM Kingston KF3200C20S4 SODIMM DDR4 Synchronous 3200 MHz</li>
<li>TerraMaster D6-320 USB 3.2 (Gen2)</li>
<li>3x TOSHIBA_MG09ACA18TE 18 TB Enterprise SATA</li>
<li>1x TOSHIBA_MG08ADA600E 6 TB Enterprise SATA (to change)</li>
<li>2x Samsung 970 EVO Plus 1TB</li>
<li>1x Samsung MZ1L23T8HBLA-00A07 4 TB Enterprise NVMe 110mm</li>
<li>1x Intel X710 2x 10 Gb/s</li>
<li>1x Good USB Stick (32GB)</li>
</ul>
<p><img src="das.jpg" alt="DAS Disk Array">
<img src="unraid.jpg" alt="Unraid"></p>
<h3 id="fun-fact-my-unraid-server-has-underglow-lighting">Fun fact: My Unraid server has underglow lighting!</h3>
]]></content>
		</item>
		
		<item>
			<title>How to get the most out of your NUC</title>
			<link>https://sdn-warrior.org/posts/nuc/</link>
			<pubDate>Sun, 17 Nov 2024 11:57:43 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/nuc/</guid>
			<description><![CDATA[Performance tuning for NUCs]]></description>
			<content type="html"><![CDATA[<h2 id="first-things-first">First things first</h2>
<p>Get a second NIC. The Intel NUC Pro has an IO expansion and supports an additional NIC.
Unfortunately, these are relatively difficult to get in Germany, but it&rsquo;s worth the effort.

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Search for</b>
        </div>
        <div class="admonition-content">ASUS NUC LAN and USB Expansion Module (90AR0000-P00010)</div>
    </aside></p>
<p><figure><picture><source srcset="/NIC_hu_b3293dbb6f8de30e.webp" type="image/webp">
          <img
            src="/NIC_hu_b3293dbb6f8de30e.webp"alt="Image of an IO expansion"width="1200"
            height="800"/>
        </picture><figcaption><p>IO expansion</p></figcaption></figure>
vSphere 8 supports the cards natively and you don&rsquo;t have to install any drivers.
It also supports jumbo frames, which is relevant for NSX Labs.
I recommend using a managed 2.5 Gb/s switch. I am using a Mikrotik with the wonderful name <code>CRS326-4C +20G+2Q</code>.</p>

    <aside class="admonition tip">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-sun">
      <circle cx="12" cy="12" r="5"></circle>
      <line x1="12" y1="1" x2="12" y2="3"></line>
      <line x1="12" y1="21" x2="12" y2="23"></line>
      <line x1="4.22" y1="4.22" x2="5.64" y2="5.64"></line>
      <line x1="18.36" y1="18.36" x2="19.78" y2="19.78"></line>
      <line x1="1" y1="12" x2="3" y2="12"></line>
      <line x1="21" y1="12" x2="23" y2="12"></line>
      <line x1="4.22" y1="19.78" x2="5.64" y2="18.36"></line>
      <line x1="18.36" y1="5.64" x2="19.78" y2="4.22"></line>
   </svg></div><b>Practical tip</b>
        </div>
        <div class="admonition-content">My experience with 2.5 Gb/s LAN has shown that it makes sense to set the ports to a fixed speed both in the hypervisor and on the switch; otherwise, I kept having network failures.</div>
    </aside>
<h2 id="memory-tiering">Memory Tiering</h2>
<p>Memory Tiering is new in vSphere 8.0 U3 and is still a Tech Preview.
With Memory Tiering, you can add an NVMe-backed memory tier of up to 400% of the physical RAM. This requires a fast NVMe.
I would recommend a PCIe4 NVMe with at least 5000 MB/s read/write.
Memory Tiering places very cold (unused) and cold (less than 20% used) RAM pages on the NVMe (the memory tier).
There is a wonderful <a href="https://www.vmware.com/explore/video-library/video/6360757998112" title="Explore USA">Explore Session</a> on this.</p>
<p>To enable memory tiering, enter the following commands via the ESX CLI:</p>
<ul>
<li>This command turns on memory tiering</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli system settings kernel set -s MemoryTiering -v TRUE
</span></span></code></pre></div><ul>
<li>This command selects the NVMe</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli system tierdevice create -d /vmfs/devices/disks/&lt;Your NVME&gt;
</span></span></code></pre></div><ul>
<li>Enter the factor here (0-400%).</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli system settings advanced set -o /Mem/TierNvmePct -i 400
</span></span></code></pre></div><p>After a reboot, you have the selected amount of additional memory.</p>

    <aside class="admonition warning">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-alert-circle">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="8" x2="12" y2="12"></line>
      <line x1="12" y1="16" x2="12.01" y2="16"></line>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">The selected disk is no longer available to ESXi for anything else.
Its minimum capacity must match the selected factor.
If the disk is larger, it will still be used entirely for memory tiering.
My recommendation is to use a 1 TB NVMe with 64 GB of physical RAM and 400% as the factor.
ESXi will use the NVMe evenly so that the disk doesn&rsquo;t wear out as quickly.</div>
    </aside>
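<p>As a quick sanity check of the sizing, here is a small Python sketch. It assumes that the factor is interpreted as a percentage of the physical DRAM and that the NVMe tier is added on top of it; the function name is hypothetical.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">def tiered_memory_gb(dram_gb, tier_nvme_pct):
    """Return (NVMe tier size, total memory) in GB for a given factor.

    Assumption: the factor is a percentage of the physical DRAM, and the
    resulting NVMe tier is added on top of the DRAM.
    """
    nvme_tier_gb = dram_gb * tier_nvme_pct / 100
    return nvme_tier_gb, dram_gb + nvme_tier_gb


# 64 GB of physical RAM with the factor set to 400.
print(tiered_memory_gb(64, 400))
</code></pre></div>
<p>With 64 GB of RAM and a factor of 400, this gives a 256 GB NVMe tier and 320 GB of total memory under that assumption.</p>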
<h2 id="using-pe-cores">Using P/E Cores</h2>
<p>Intel introduced a hybrid CPU architecture (similar to big.LITTLE) with the 12th generation of its consumer CPUs. This leads to some problems with ESXi. If the efficiency cores are activated, ESXi starts with a PSOD (Purple Screen of Death).
Fortunately, there are a few workarounds here.</p>
<ul>
<li>Disable the E cores in the BIOS</li>
</ul>
<p>This means that you can use hyperthreading and the P cores. However, you are clearly wasting potential here. That&rsquo;s why we don&rsquo;t want to do that.</p>
<ul>
<li>Use P and E cores and sacrifice hyperthreading for them</li>
</ul>
<p>My tests showed that I got significantly more performance out of my 13th generation i7 if I didn&rsquo;t use hyperthreading and only used “real” CPU cores, even if the E cores have a lower clock rate.
<a href="https://williamlam.com/2023/01/video-of-esxi-install-workaround-for-fatal-cpu-mismatch-on-feature-for-intel-12th-gen-cpus-and-newer.html">William Lam</a> has written very detailed blog articles about this. I link to him here for more information, as this article was only intended to be a short summary.</p>
<p>We actually only need two ESX CLI commands to make it all work.</p>
<ul>
<li>With this command, we prevent the PSOD from occurring when the ESXi boots.</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli system settings kernel set -s cpuUniformityHardCheckPanic -v FALSE
</span></span></code></pre></div><ul>
<li>With this command, we prevent the ESXi from getting a PSOD when the VMs are switched on.</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">esxcli system settings kernel set -s ignoreMsrFaults -v TRUE
</span></span></code></pre></div>
    <aside class="admonition tip">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-sun">
      <circle cx="12" cy="12" r="5"></circle>
      <line x1="12" y1="1" x2="12" y2="3"></line>
      <line x1="12" y1="21" x2="12" y2="23"></line>
      <line x1="4.22" y1="4.22" x2="5.64" y2="5.64"></line>
      <line x1="18.36" y1="18.36" x2="19.78" y2="19.78"></line>
      <line x1="1" y1="12" x2="3" y2="12"></line>
      <line x1="21" y1="12" x2="23" y2="12"></line>
      <line x1="4.22" y1="19.78" x2="5.64" y2="18.36"></line>
      <line x1="18.36" y1="5.64" x2="19.78" y2="4.22"></line>
   </svg></div><b>Practical tip</b>
        </div>
        <div class="admonition-content">When reinstalling an ESXi server, I always switch off the E cores, which saves me from having to manipulate the boot loader. After I have enabled memory tiering and the P/E core workaround via the ESX CLI, I switch the E cores back on in the BIOS.</div>
    </aside>
<p>If everything is correct, an ESX NUC of the 13th generation looks like this.
<figure><picture><source srcset="/nuc_hu_21e9fb84617f65c0.jpg" type="image/jpeg">
          <img
            src="/nuc_hu_21e9fb84617f65c0.jpg"alt="NUC i7"width="1098"
            height="458"/>
        </picture><figcaption><p>NUC i7 13th Gen with Memory Tiering and P/E Cores</p></figcaption></figure></p>
]]></content>
		</item>
		
		<item>
			<title>Homelab V4</title>
			<link>https://sdn-warrior.org/posts/labv4/</link>
			<pubDate>Sat, 16 Nov 2024 20:00:00 +0000</pubDate>
			
			<guid>https://sdn-warrior.org/posts/labv4/</guid>
			<description><![CDATA[Homelab v4]]></description>
			<content type="html"><![CDATA[<h2 id="ready-for-vcf">Ready for VCF</h2>
<p>I have done a huge redesign of my Homelab.
To better test VCF scenarios, 3 new Minisforum MS-01 have been added.
These have a 13th generation i9 and are equipped with fast NVMes for memory tiering.
They also have 2x10G and 2x2.5G networking on board for various VM workloads.
Furthermore, I converted my storage from NFS to iSCSI with multipathing, which gets even more performance out of my self-built Unraid.
I manage about 2 GB/s read / 1.2 GB/s write and 78K IOPS (random 4K, Q32T1) in a Windows 11 VM.</p>
<figure><picture><source srcset="/bench1_hu_c7391d1663624761.jpg" type="image/jpeg">
          <img
            src="/bench1_hu_c7391d1663624761.jpg"alt="Disk Performance iSCSI Multipathing"width="483"
            height="351"/>
        </picture><figcaption><p>Disk Performance iSCSI Multipathing</p></figcaption></figure>
<figure><picture><source srcset="/bench2_hu_e20ae71514f46813.jpg" type="image/jpeg">
          <img
            src="/bench2_hu_e20ae71514f46813.jpg"alt="IOPS iSCSI Multipathing"width="483"
            height="356"/>
        </picture><figcaption><p>IOPS iSCSI Multipathing</p></figcaption></figure>
<p>Pretty impressive for my setup. I still have to customize the rack a bit so that I can add the 10G Mikrotik switch and clean up the VLANs from old labs.
I&rsquo;m already planning a further expansion stage though.</p>
]]></content>
		</item>
		
		<item>
			<title>NSX Integration Fortigate</title>
			<link>https://sdn-warrior.org/posts/nsx-integration-fortigate/</link>
			<pubDate>Mon, 26 Aug 2024 19:49:23 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/nsx-integration-fortigate/</guid>
			<description><![CDATA[NSX and Fortigate]]></description>
			<content type="html"><![CDATA[<h2 id="nsx-integration-for-fortinet-fortigate-firewalls">NSX integration for Fortinet Fortigate Firewalls</h2>
<p>Modern SDN solutions are flexible, fast and effective. The rules of the classic perimeter firewall should be exactly the same. To make life easier, Fortinet has an NSX integration that allows us to write our perimeter rules to dynamic NSX groups.</p>
<h2 id="first-things-first">First things first</h2>
<p>The Fortinet NSX integration works via a so-called external connector. For this purpose, the Fortigate contacts the NSX Manager at regular intervals and updates the previously imported groups.
This allows us to use dynamic groups that were previously created in NSX using tags, for example.</p>
<p>First we need to configure our connector. To do this, go to the Fortigate at <em><strong>Security Fabric / External Connectors</strong></em> and click on <em><strong>Create New</strong></em>.</p>
<p><img src="01.webp" alt="Fortigate Dialog"></p>
<p>Here we need to enter our NSX Manager. If we have an NSX Manager cluster, the cluster VIP or its FQDN is needed instead.
We can also define an update interval, which determines how long it takes for group changes to reach the Fortigate.
In my lab I chose 30 seconds; depending on the environment, lower or higher values may make sense. In a production environment, the certificate should always be verified.
In my homelab environment I deliberately turned this off.</p>
<p><img src="02_External-Connector.webp" alt="External Connector"></p>
<h2 id="importing-the-dynamic-nsx-groups">Importing the dynamic NSX groups</h2>
<p>The groups need to be imported via the Fortigate CLI. This is relatively easy to do, either for all groups at once or for individual groups.
Groups imported this way are updated automatically from then on. Groups that are created in NSX later must also be imported via the CLI if they are to be used in the rules.</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Fortigate CLI</b>
        </div>
        <div class="admonition-content"><code>execute nsx group import &lt;SDN Connector&gt; &lt;VDOM&gt; &lt;group&gt;</code></div>
    </aside>
<p>If you want to import all NSX groups, you need to omit the group name in the CLI call. In the screenshot you can see me importing the <em><strong>dFG_AlpineLinux</strong></em> NSX group.
This uses an NSX tag to combine all VMs of type Alpine Linux into one security group.</p>
<p><img src="03_Group-Import.png" alt="Group-Import"></p>
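<p>If several groups have to be imported, the CLI lines can be generated instead of typed. The following is a minimal Python sketch that only builds the command strings shown above; the connector name <code>nsx-lab</code> and the VDOM <code>root</code> in the usage note are hypothetical placeholders, and the sketch does not talk to the Fortigate itself.</p>

```python
def build_import_commands(connector, vdom, groups=None):
    """Return FortiGate CLI lines that import NSX groups.

    If no group names are given, the group name is omitted from the
    command, which imports all NSX groups.
    """
    base = f"execute nsx group import {connector} {vdom}"
    if not groups:
        return [base]
    return [f"{base} {group}" for group in groups]
```

<p>Calling <code>build_import_commands("nsx-lab", "root", ["dFG_AlpineLinux"])</code> returns a single line that can be pasted into the CLI.</p>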
<p>In the Fortigate, you can now find the group under <em><strong>Policy &amp; Objects / Addresses</strong></em> and use it like any other group in firewall policies. The NSX groups can be used not only for firewall rules, but also for policy-based routing via the SD-WAN feature.</p>
<p><img src="04_FW-Groups.webp" alt="Firewall Groups"></p>
<p>In my example, I am prohibiting the Alpine Linux VMs from accessing the Internet. The currently realised group membership can be checked at any time via <em><strong>Policy &amp; Objects / Addresses</strong></em> and a double click on the group.</p>
<p><img src="05_matched-adress.webp" alt="Matched Address"></p>
<h2 id="delete-groups">Delete groups</h2>
<p>Groups need to be deleted manually. The easiest way to do this is via the Fortigate CLI. To do this, execute the following CLI command:</p>

    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Fortigate CLI</b>
        </div>
        <div class="admonition-content"><code>execute nsx group delete &lt;VDOM&gt; &lt;filter&gt;</code></div>
    </aside>
<p>If you want to delete all groups, you can simply leave the filter empty. If a group is used in a firewall policy, it cannot be deleted and you will receive a message that the group is in use.</p>
<h2 id="testing-the-solution">Testing the solution</h2>
<p>To do this, I log on to the Alpine2 VM and check the current IP. The VM has currently been assigned 172.31.2.10. We can also find this address on the Fortigate in our dFG_AlpineLinux group. I try to ping a host on the Internet, and the ICMP traffic is blocked by the Fortigate firewall as expected.</p>
<p><img src="06_test-1.webp" alt="First Test"></p>
<p>Next, I remove the Alpine Linux tag in NSX, which ensures that the VM is no longer realised in the dFG_AlpineLinux group on the Fortigate after 30 seconds at the latest.</p>
<p><img src="07_test-2.webp" alt="Second Test"></p>
<p>Finally, I repeat my ping test. As expected, Internet access now works without any problems.</p>
<p><img src="08_icmp.png" alt="Test Number three"></p>
<h2 id="remarks">Remarks</h2>
<p>If the connection to NSX Manager is interrupted, group membership remains at the last synchronised state. In highly dynamic environments this means that traffic may be wrongly allowed or wrongly blocked. For this reason, the SDN connection should always be monitored. All group changes are recorded in the SDN Connector log of the Fortigate.</p>
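<p>What such monitoring could check is sketched below in Python, under stated assumptions: the timestamp of the last successful update has to come from your own monitoring source (how it is obtained is left open here), and the tolerance of three missed intervals is an arbitrary example value, not a Fortinet recommendation.</p>

```python
from datetime import datetime, timedelta, timezone


def is_sync_stale(last_update, now, interval_seconds=30, tolerance=3):
    """Return True if no update arrived within `tolerance` update intervals.

    `interval_seconds` mirrors the update interval set on the connector
    (30 seconds in the lab); `tolerance` is an assumed safety factor.
    """
    return now - last_update > timedelta(seconds=interval_seconds * tolerance)
```

<p>With the 30-second interval from the lab, a connector whose last update is two minutes old would be reported as stale.</p>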
<h2 id="use-cases">Use cases</h2>
<p>One conceivable scenario would be to enable a dynamic firewall for VMs that are allowed to access the Internet. This can be done in NSX using a tag and a group. Every VM that does not have a tag and is therefore not in the group will be blocked at the Fortigate perimeter firewall.</p>
<p><img src="09_firewall_rule.webp" alt="Firewall Rules"></p>
<p>The firewall rule allows everything that does not go into RFC1918 networks (private IP range). Of course, this is only a simple example and more complex setups are possible.</p>
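<p>The matching logic of this example rule can be expressed in a few lines of Python. This is only an illustration of the idea (source is in the tagged group, destination is outside RFC1918), not how the Fortigate evaluates policies internally; the function names are made up for this sketch.</p>

```python
import ipaddress

# The three private IPv4 ranges defined in RFC1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the address lies in one of the RFC1918 ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


def internet_rule_matches(source_in_group, destination):
    """Allow only tagged sources, and only towards non-private destinations."""
    return source_in_group and not is_rfc1918(destination)
```

<p>The lab address 172.31.2.10 falls into 172.16.0.0/12, so it counts as private, whereas a public address such as 8.8.8.8 would match the rule for a tagged VM.</p>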
<h2 id="additional-information">Additional information</h2>
<p><a href="https://docs.fortinet.com/document/fortigate/7.4.4/administration-guide/753961/public-and-private-sdn-connectors">Fortinet Documentation: Public and private SDN connectors</a></p>
]]></content>
		</item>
		
		<item>
			<title>NSX Identity Firewall – A Case Study With the Flavour VDI</title>
			<link>https://sdn-warrior.org/posts/nsx-idfw-vdi/</link>
			<pubDate>Fri, 02 Aug 2024 08:34:23 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/nsx-idfw-vdi/</guid>
			<description><![CDATA[IDFW with NSX and Windows Clients]]></description>
			<content type="html"><![CDATA[
    <aside class="admonition info">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor"
      stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-info">
      <circle cx="12" cy="12" r="10"></circle>
      <line x1="12" y1="16" x2="12" y2="12"></line>
      <line x1="12" y1="8" x2="12.01" y2="8"></line>
   </svg></div><b>Disclaimer</b>
        </div>
        <div class="admonition-content">There are of course other ways of using the Identity Firewall. This is the way I have used it with my customers so far. Of course, the whole thing is colored by personal preferences, experiences and customer requirements so take this as an idea for your own implementations.</div>
    </aside>
<h2 id="intro">Intro</h2>
<p>A customer of mine has the use case that his entire environment must be micro-segmented. Of course, this does not stop at the VDI environment. Since my customer uses non-persistent VDIs (the VMs are deleted after each logoff), no tags can be used for the security groups. After deleting the VM, the tag would also be removed and a new VDI would have no tags and would thus be isolated. This can be solved with automation or with the Identity Firewall (a combination of both solutions is also conceivable and makes sense). In the first step, my customer opted for the Identity Firewall because it allows generic VDIs to be used and authorisations to resources can be conveniently controlled via the customer AD. In addition, each user/user group receives individual firewall rules, which corresponds to the need-to-know principle.</p>
<h3 id="you-can-use-the-identity-firewall-in-2-ways">You can use the Identity Firewall in 2 ways:</h3>
<ul>
<li>Variant 1 would be with VMware Tools and Guest Introspection.</li>
<li>Variant 2 would be with log scraping and, for example, Aria Operations for Logs as a log server.</li>
</ul>
<p>I use variant 1 because it means less implementation effort in my customer setup and we only want to protect the VDIs with the Identity Firewall.</p>
<h3 id="limitations">Limitations</h3>
<ul>
<li>No User/Group ID support for Federation.</li>
<li>No direct integration with VDI and RDSH.</li>
<li>User-ID based rules are supported only for firewall rules.</li>
<li>No User-ID based policy for IDS/IPS and TLS Inspection.</li>
<li>Multi-User (RDSH) does not support the Server Message Block (SMB) protocol.</li>
</ul>
<h3 id="supported-protocols">Supported protocols</h3>
<ul>
<li>Single user (VDI, or Non-RDSH Server) use case support – TCP, UDP</li>
<li>Multi-User (RDSH) use case support – TCP, UDP</li>
</ul>
<h3 id="supported-clients">Supported clients</h3>
<ul>
<li>Windows 8, 10, 11</li>
<li>Windows Server 2012 – 2022</li>
</ul>
<h3 id="what-is-needed">What is needed?</h3>
<ul>
<li>NSX in the VDI cluster</li>
<li>AD infrastructure</li>
<li>VMware Tools</li>
<li>VMware Aria Operations for Logs (optional)</li>
</ul>
<h2 id="first-things-first">First things first</h2>
<p>Identity based groups can only be used as a source for a firewall rule. In addition, the rules only come into effect after a successful logon to the client. Thus, normal dFW rules must be written for all communication that happens before the user logs on. This applies, for example, to Windows AD communication. The NSX Manager synchronises cyclically with the domain controllers. The default interval is 180 minutes. If changes are made to the group membership at short notice, a manual sync can be performed. Alternatively, the sync interval can also be shortened or extended. Using network introspection, the NSX Manager recognises when a user logs on to a client and can perform matching with the AD groups and thus add the client dynamically to the security group of the IDFW firewall rule.</p>
<figure><picture><source srcset="/idfw/00_idfw_hu_b9b82cb505a9ed72.jpg" type="image/jpeg">
          <img
            src="/idfw/00_idfw_hu_b9b82cb505a9ed72.jpg"alt="Identity Firewall function"width="1782"
            height="1164"/>
        </picture><figcaption><p>Identity Firewall function</p></figcaption></figure>
<h2 id="getting-started">Getting started</h2>
<p>I start by customising the golden images of the VDI and installing the NSX Guest Introspection drivers. These are not part of the default VMware Tools installation and have to be selected explicitly. You can find them under VMware Device Driver – NSX Network Introspection. File Introspection is installed automatically.</p>
<figure><picture><source srcset="/idfw/01_vmwaretools_hu_88dbb65feab63900.jpeg" type="image/jpeg">
          <img
            src="/idfw/01_vmwaretools_hu_88dbb65feab63900.jpeg"alt="VMware Tools"width="555"
            height="417"/>
        </picture><figcaption><p>VMware Tools</p></figcaption></figure>
<p>Once Guest Introspection has been successfully installed, we no longer need to do anything on our Windows clients. Next, the domain must be integrated.</p>
<p>This is done in the NSX Manager under System -&gt; Identity Firewall AD. Several domains can be entered, so multi-tenant setups are also possible. The NSX Manager must be able to reach the domain controllers via LDAP or LDAPS, so the corresponding firewall rules need to be in place. I strongly recommend the use of LDAPS. These settings can also be used to perform a manual sync or check the synchronisation status.</p>
<figure><a href="02_nsx_idfw.jpeg"><picture><source srcset="/idfw/02_nsx_idfw_hu_1cd4b4c2f476db5.jpeg" type="image/jpeg">
          <img
            src="/idfw/02_nsx_idfw_hu_1cd4b4c2f476db5.jpeg"alt="Identity firewall AD settings"width="1452"
            height="465"/>
        </picture></a><figcaption><p>Identity firewall AD settings (click to enlarge)</p></figcaption></figure>
<p>Under LDAP Server you can set several domain controllers for the previously set up domain. The protocol used is also selected here. I use the Domain Administrator in my lab. In a production environment, a dedicated LDAP bind user should always be used.</p>
<figure><a href="03.jpeg"><picture><source srcset="/idfw/03_hu_3f980145c1b8fca6.jpeg" type="image/jpeg">
          <img
            src="/idfw/03_hu_3f980145c1b8fca6.jpeg"alt="LDAP Server settings"width="1160"
            height="460"/>
        </picture></a><figcaption><p>LDAP Server settings (click to enlarge)</p></figcaption></figure>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">If LDAPS is used, NSX imports the SHA thumbprint of the domain controller certificate. When the certificate is renewed, which usually happens automatically after 2 years at the latest, NSX loses its trust in the domain controller. In this case, the trust must be re-established manually. To do this, delete the thumbprint and reconnect with the bind user. It has proven to be practical to monitor certificate expiry times and to enter at least 2 domain controllers that renew their certificates with a 4-week time lag. If the trust fails completely, no more identity rules are applied and the default firewall rule comes into effect. In practice, this should be an any/any default drop and log rule.</div>
    </aside>
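<p>Monitoring the expiry time can be as simple as comparing the certificate's <code>notAfter</code> value with the current time. The Python sketch below assumes the date is available in the text format that Python's <code>ssl</code> module reports for peer certificates; the warning threshold of 28 days is an assumption chosen to match the 4-week lag mentioned above.</p>

```python
import ssl


def days_until_expiry(not_after, now_epoch):
    """Return the number of whole days until the certificate expires.

    `not_after` uses the notAfter text format of the ssl module,
    for example 'Jan  5 09:34:43 2018 GMT'.
    """
    expiry_epoch = ssl.cert_time_to_seconds(not_after)
    return int((expiry_epoch - now_epoch) // 86400)


def needs_attention(not_after, now_epoch, warn_days=28):
    """Return True once the remaining lifetime drops to `warn_days` or less."""
    return warn_days >= days_until_expiry(not_after, now_epoch)
```

<p>In a real check, <code>now_epoch</code> would simply be <code>time.time()</code>, and the result would feed whatever alerting is already in place.</p>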
<p>After we have successfully set up and synchronised our domain, we only need to activate the Identity Firewall. By default, this feature is disabled (it is a free feature that is available with the NSX Firewall VCF add-on). To activate it, go to Security -&gt; Distributed Firewall -&gt; Settings -&gt; Identity Firewall Settings and enable the Identity Firewall service toggle. Then we select the cluster on which we want to activate the service. Now we can get started.</p>
<figure><a href="04.webp"><picture><source srcset="/idfw/04_hu_3501ba0bfbdac83b.webp" type="image/webp">
          <img
            src="/idfw/04_hu_3501ba0bfbdac83b.webp"alt="Distributed Firewall Settings"width="1699"
            height="792"/>
        </picture></a><figcaption><p>Distributed Firewall Settings (click to enlarge)</p></figcaption></figure>
<h2 id="identity-firewall-rules">Identity Firewall Rules</h2>
<p>A general recommendation that applies to all firewalls is to think about a naming concept. At my customer, we have different name prefixes for the various security groups in NSX or in the other firewalls. For my part, I prefer the following naming convention, for example:</p>
<p>Distributed firewall groups start with dFW_XXX, an LDAP backed security group with dFWU_XXX (the U stands for user). For a group that contains an NSX segment it would be dFWS_XXX and for the gateway firewall a gWF_XXX and so on.</p>
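<p>A naming convention is easiest to keep when it can be checked automatically. As a small illustration, the following Python sketch maps the prefixes listed above to their meaning; the function name and the descriptive labels are my own choices for this example.</p>

```python
# Prefixes exactly as listed in the naming convention above.
PREFIXES = {
    "dFW_": "distributed firewall group",
    "dFWU_": "LDAP-backed user group",
    "dFWS_": "segment group",
    "gWF_": "gateway firewall group",
}


def classify_group(name):
    """Return the group type for a name, or None if it breaks the convention."""
    for prefix, kind in PREFIXES.items():
        # A bare prefix without an actual name does not count as valid.
        if name.startswith(prefix) and len(name) > len(prefix):
            return kind
    return None
```

<p>A check like this could run against an export of all NSX groups and list every name for which <code>classify_group</code> returns <code>None</code>.</p>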
<p>So we create our first LDAP user group. As with any group, we can do this either when creating the rules or in the inventory under Groups. The process is the same as for a normal Distributed Firewall Group, except that we don’t use tags but Distinguished Names, which can be conveniently selected or filtered from the synchronised AD elements.</p>
<figure><a href="05.webp"><picture><source srcset="/idfw/05_hu_bea79e3bbb4bd7e8.webp" type="image/webp">
          <img
            src="/idfw/05_hu_bea79e3bbb4bd7e8.webp"alt="Security Groups"width="1164"
            height="977"/>
        </picture></a><figcaption><p>Security Groups (click to enlarge)</p></figcaption></figure>
<p>I also need at least one segment group and at least one group containing my target servers. The target servers are assigned to their groups via tags. The same applies to the segment group. To do this, I set one or more tags on the overlay segment and use this tag as a condition for group membership.</p>
<figure><a href="06.webp"><picture><source srcset="/idfw/06_hu_336449cf75078643.webp" type="image/webp">
          <img
            src="/idfw/06_hu_336449cf75078643.webp"alt="Member selection"width="1156"
            height="964"/>
        </picture></a><figcaption><p>Member selection (click to enlarge)</p></figcaption></figure>
<p>Our goal is to allow the users assigned to dFWU_UserGroup1 to access our file server via SMB, while the users of the group dFWU_UserGroup2 must not receive any access. I have two domain users in my lab: User1 is in the AD group assigned to dFWU_UserGroup1, and User2 is only assigned to dFWU_UserGroup2.</p>
<h2 id="creating-the-firewall-rules">Creating the firewall rules</h2>
<p>For each identity firewall rule that allows traffic from a group of users to a destination, there must be a corresponding distributed firewall rule that allows traffic from a group of computers to the same destination specified in the identity firewall rule. We therefore need two firewall rules.</p>
<figure><a href="07.webp"><picture><source srcset="/idfw/07_hu_248d0614bcc88974.webp" type="image/webp">
          <img
            src="/idfw/07_hu_248d0614bcc88974.webp"alt="Firewall Rules"width="1630"
            height="226"/>
        </picture></a><figcaption><p>Firewall Rules (click to enlarge)</p></figcaption></figure>
<p>The first rule is pretty straightforward: the source is our dFWU_UserGroup1, the target is our dFG_Fileserver and the service is SMB. The Applied To field is even more important than usual for the Identity Firewall. We may only apply this rule to our VDIs. Since my customer has different pools that are named according to a specific naming scheme, I can further restrict the scope based on the computer name. Each pool has different rules and we only want the rules to be realised on VMs where they are needed. The second rule is a bit more interesting. As a source, we have our VDI segment or segments. As in the first rule, the target is our file server. Logically, the service is also the same.</p>

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">For the second rule, it is important that the Applied To field contains only the file servers or the target that we want to enable. If we were to use the dFG_VDI_GroupA or the dFGS_VDI group here, for example, the entire Identity Firewall rule would be cancelled out!</div>
    </aside>
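<p>Why two rules are needed, and why the Applied To field matters so much, can be illustrated with a deliberately simplified Python model. It assumes that a flow is checked once on the source VDI and once on the target VM, that only the VDI knows the logged-on user, and that anything without a matching rule is dropped. This is a thinking aid, not a description of the real NSX data path, and all class and function names are invented for the sketch.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    sources: frozenset      # group names accepted as source
    destination: str        # destination group name
    service: str
    applied_to: frozenset   # groups whose VMs enforce this rule


def allowed_at(rules, enforcing_groups, source_groups, destination, service):
    """Return True if a rule enforced on this VM permits the flow (default: drop)."""
    for rule in rules:
        if (not rule.applied_to.isdisjoint(enforcing_groups)
                and not rule.sources.isdisjoint(source_groups)
                and rule.destination == destination
                and rule.service == service):
            return True
    return False


def flow_allowed(rules, vdi_groups, user_groups, target_groups, destination, service):
    """Check the flow on the source VDI and again on the target VM."""
    # Identity (user) groups are only known on the VDI, so they count there only.
    at_source = allowed_at(rules, vdi_groups, vdi_groups | user_groups, destination, service)
    at_target = allowed_at(rules, target_groups, vdi_groups, destination, service)
    return at_source and at_target


VDI = frozenset({"dFG_VDI_GroupA", "dFGS_VDI"})
FILESERVER = frozenset({"dFG_Fileserver"})

identity_rule = Rule(frozenset({"dFWU_UserGroup1"}), "dFG_Fileserver", "SMB",
                     frozenset({"dFG_VDI_GroupA"}))
segment_rule = Rule(frozenset({"dFGS_VDI"}), "dFG_Fileserver", "SMB", FILESERVER)
# Misconfiguration: the segment rule is also applied to the VDI segment group.
bad_segment_rule = Rule(frozenset({"dFGS_VDI"}), "dFG_Fileserver", "SMB",
                        FILESERVER | frozenset({"dFGS_VDI"}))
```

<p>In this model, User1 passes with <code>identity_rule</code> and <code>segment_rule</code>, and User2 is dropped on the VDI. If <code>bad_segment_rule</code> is used instead, User2 passes as well, because the segment rule now also matches on the VDI regardless of who is logged on.</p>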
<h2 id="testing-and-verifying">Testing and verifying</h2>
<p>The test is considered successful if User1 on the TestVDI can establish a successful SMB connection to the file server. If User2 is used instead of User1, the traffic to the file server must be blocked by the firewall.</p>
<p>For testing, I log in to my VDI with the credentials of User1 and run Test-NetConnection (TNC) in PowerShell. This is a simple and quick way to test TCP connections. I also open a share on the file server.</p>
<figure><a href="08.webp"><picture><source srcset="/idfw/08_hu_7b077c01d487e9a5.webp" type="image/webp">
          <img
            src="/idfw/08_hu_7b077c01d487e9a5.webp"alt="Network test user 1"width="1072"
            height="627"/>
        </picture></a><figcaption><p>Network test user 1 (click to enlarge)</p></figcaption></figure>
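<p>On systems without PowerShell, a comparable TCP reachability check can be done with a few lines of Python. This is a rough equivalent of the port test only, not a reimplementation of Test-NetConnection; the host name in the usage note is a placeholder.</p>

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

<p>For the SMB test from this post, the call would look like <code>tcp_port_open("fileserver.lab.home", 445)</code>, with the host name replaced by your own file server.</p>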
<p>The test was successful: both the TNC command and the actual opening of the file share worked. Now I’m running the same test on the same VDI (after it was recreated, since it is a non-persistent VDI), only this time I’m using User2, which has no explicit firewall rules and is therefore blocked by my default cleanup rule. As expected, the traffic was successfully blocked.</p>
<figure><a href="09.png"><picture><source srcset="/idfw/09_hu_af37365be7ab176f.png" type="image/png">
          <img
            src="/idfw/09_hu_af37365be7ab176f.png"alt="Network test user 2"width="993"
            height="499"/>
        </picture></a><figcaption><p>Network test user 2 (click to enlarge)</p></figcaption></figure>
<h2 id="lessons-learned">Lessons learned</h2>
<p>This is where I would like to add a few more thoughts on the subject. Troubleshooting is more difficult in practice than I thought. Tools such as NSX Traceflow cannot be used because you cannot add an AD user to the request. As a result, the traffic in Traceflow is always shown as dropped, and you cannot tell from it whether the identity rule is configured correctly.</p>
<p>But there is light at the end of the tunnel. In NSX 4.X there is a session view of the active IDFW user session under Security -&gt; Security Overview -&gt; Configuration. All active sessions, UserIDs and VMs are displayed here, as well as the source of the information.</p>
<figure><a href="10.png"><picture><source srcset="/idfw/10_hu_6e27531d51e5c107.png" type="image/png">
          <img
            src="/idfw/10_hu_6e27531d51e5c107.png"alt="Active Sessions"width="979"
            height="278"/>
        </picture></a><figcaption><p>Active Sessions (click to enlarge)</p></figcaption></figure>
<p>Another tip is to always check the sync status with the AD. Ask your AD admin when the user was added to the group. If the user has several accounts, ask which user name was used. Experience has shown that this is where most problems occur.</p>
<p>Use a syslog server and check exactly with which rule ID the traffic was discarded. Have all deny rules logged.</p>
<p>Not all rules can be implemented as Identity Firewall Rules. The Windows domain basic communication can only be enabled via a classic set of rules, as no Identity Firewall rules are active for the VM without an active user session.</p>
<h2 id="important-things-to-know">Important things to know</h2>
<p>Never install Guest Introspection on a target. If a user has remote desktop permissions on the target and guest introspection is active there, then the target receives all of the user’s firewall rules. This can lead to unwanted firewall permissions.</p>
<p>If targets outside of NSX are addressed, such as a NAS or legacy infrastructure, a second rule is not required (unless the gateway firewall is also used). In this case, the distributed firewall will only check the traffic at the source VM.</p>
<p>Any change on a domain, including a domain name change, will trigger a full sync with Active Directory. Because a full sync can take a long time, I recommend syncing during off-peak or non-business hours.</p>
<p>Multi-user setups only work with RDSH (Remote Desktop Session Hosts), which requires a special configuration. Otherwise, if several users are logged on to a client at the same time, this leads to unwanted behavior and, in the worst case, to unwanted firewall permissions.</p>
<h2 id="conclusion">Conclusion</h2>
<p>The Identity Firewall is a wonderful extension of the Distributed Firewall and should be treated as such. Used correctly, it provides a very nice way to manage and delegate firewall permissions dynamically and centrally for individual users or user groups. It enables generic VDIs that can be used for different purposes depending on the user. This can reduce the number of VDI pools required, which in turn makes it easier to manage the customer's VDI environment. The RBAC concept is even more tightly bound to the firewall policies and AD tiering can also be enforced via the firewall. And best of all, this great feature is included in the NSX Firewall license. I would recommend that every NSX firewall administrator take a closer look at the Identity Firewall.</p>
<h2 id="additional-resources">Additional resources</h2>
<p><a href="https://docs.vmware.com/en/VMware-NSX/4.1/administration/GUID-9CD3FC21-9ED4-4FB3-9E19-67A7C4D1F53E.html">VMware Docs Identity Firewall</a></p>
]]></content>
		</item>
		
		<item>
			<title>NSX 4.X Certificate exchange of the NSX Manager</title>
			<link>https://sdn-warrior.org/posts/nsx-cert-exchange/</link>
			<pubDate>Fri, 05 Apr 2024 23:22:00 +0100</pubDate>
			
			<guid>https://sdn-warrior.org/posts/nsx-cert-exchange/</guid>
			<description><![CDATA[Exchange your NSX Manager certificates]]></description>
			<content type="html"><![CDATA[<h1 id="nsx-4x-certificate-exchange-of-the-nsx-manager">NSX 4.X Certificate exchange of the NSX Manager</h1>
<h2 id="certificate-creation">Certificate creation</h2>
<p>First of all, we need a CSR (certificate signing request). This can be created with OpenSSL. It is important that the private key is also exported. You can either create 4 individual certificates (one for the VIP and one for each of the three manager nodes) or a single SAN certificate with all DNS names and IP addresses of the manager nodes. The easiest way is to create the request on a manager node. To do this, I create an OpenSSL config file with vim.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">[req]
</span></span><span class="line"><span class="cl">default_bits = 4096
</span></span><span class="line"><span class="cl">default_md = sha256
</span></span><span class="line"><span class="cl">days = 365
</span></span><span class="line"><span class="cl">distinguished_name = req_distinguished_name
</span></span><span class="line"><span class="cl">req_extensions = v3_req
</span></span><span class="line"><span class="cl">prompt = no
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl">[req_distinguished_name]
</span></span><span class="line"><span class="cl">C   = DE
</span></span><span class="line"><span class="cl">ST  = RLP
</span></span><span class="line"><span class="cl">L   = NW
</span></span><span class="line"><span class="cl">O   = Land RLP
</span></span><span class="line"><span class="cl">OU  = sdnwarrior
</span></span><span class="line"><span class="cl">CN  = nsxm0001.lab.home
</span></span><span class="line"><span class="cl">emailAddress = mail@lab.home
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl">[v3_req]
</span></span><span class="line"><span class="cl">subjectAltName = @sans
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl">[sans]
</span></span><span class="line"><span class="cl">DNS.1 = nsxm0001.lab.home
</span></span><span class="line"><span class="cl">DNS.2 = nsxm0002.lab.home
</span></span><span class="line"><span class="cl">DNS.3 = nsxm0003.lab.home
</span></span><span class="line"><span class="cl">DNS.4 = nsxm0004.lab.home
</span></span><span class="line"><span class="cl">IP.1 = 192.168.12.110
</span></span><span class="line"><span class="cl">IP.2 = 192.168.12.111
</span></span><span class="line"><span class="cl">IP.3 = 192.168.12.112
</span></span><span class="line"><span class="cl">IP.4 = 192.168.12.113
</span></span></code></pre></div><p>The CSR is generated with the following command:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">openssl req -new -newkey rsa:4096 -nodes -keyout nsxm0001.key -out nsxm0001.csr -config openssl.cnf
</span></span></code></pre></div><p>Two files are generated, a private key file and the actual request, which must be submitted to the CA.</p>
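<p>If the same request is needed for several environments, the config file can also be generated. The following Python sketch produces a reduced version of the config shown above (only the CN and the SAN entries are variable, the other subject fields are left out for brevity); the function name is arbitrary and the result still has to be saved to a file and passed to <code>openssl req</code> via <code>-config</code>.</p>

```python
def build_openssl_config(common_name, dns_names, ip_addresses):
    """Return the text of an OpenSSL request config with SAN entries."""
    lines = [
        "[req]",
        "default_bits = 4096",
        "default_md = sha256",
        "distinguished_name = req_distinguished_name",
        "req_extensions = v3_req",
        "prompt = no",
        "",
        "[req_distinguished_name]",
        f"CN = {common_name}",
        "",
        "[v3_req]",
        "subjectAltName = @sans",
        "",
        "[sans]",
    ]
    # SAN entries are numbered from 1, separately for DNS names and IPs.
    lines += [f"DNS.{i} = {name}" for i, name in enumerate(dns_names, start=1)]
    lines += [f"IP.{i} = {ip}" for i, ip in enumerate(ip_addresses, start=1)]
    return "\n".join(lines) + "\n"
```

<p>Passing the four host names and four IP addresses from the example above reproduces the <code>[sans]</code> section of the handwritten file.</p>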

    <aside class="admonition attention">
        <div class="admonition-title">
            <div class="icon"><svg xmlns="http://www.w3.org/2000/svg" class="feather feather-link" width="24" height="24" viewBox="0 0 24 24"
      fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
      <path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"></path>
      <path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"></path>
   </svg></div><b>Attention</b>
        </div>
        <div class="admonition-content">The CA must issue the certificate with the extension basicConstraints = cA:FALSE, otherwise the certificate cannot be used. With a Windows CA, this must be explicitly permitted in the template. If the extension is missing, the certificate validation will fail with an error message that the certificate key does not match the certificate.</div>
    </aside>
<h2 id="import-certificate">Import certificate</h2>
<p>The certificate can be imported in the NSX Manager under System &gt; Certificates &gt; Import. Make sure that the service certificate slider is set to NO. The complete certificate chain is also required. The chain must be in the industry standard order of ‘certificate – intermediate – root’.</p>
<figure><picture><source srcset="/nsx-cert/01_hu_8b9c0ce65c7f3013.webp" type="image/webp">
          <img
            src="/nsx-cert/01_hu_8b9c0ce65c7f3013.webp"alt="NSX Cert"width="582"
            height="924"/>
        </picture><figcaption><p>Import NSX Cert</p></figcaption></figure>
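<p>Putting the chain together by hand is error-prone, so here is a minimal Python sketch that concatenates the PEM blocks in the required order. It works on the PEM text only and does not verify that the certificates actually belong together; the function name is arbitrary.</p>

```python
def build_chain(leaf_pem, intermediate_pems, root_pem):
    """Return one PEM text in the order certificate, intermediate(s), root."""
    parts = [leaf_pem, *intermediate_pems, root_pem]
    # Strip stray blank lines so the blocks follow each other directly.
    return "\n".join(part.strip() for part in parts) + "\n"
```

<p>The returned text can be pasted into the certificate field of the import dialog.</p>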
<p>After the import, the certificate can be validated using an API request.
API calls may vary depending on the NSX version. In my example, NSX version 4.1.2.3 is used.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">GET https://&lt;nsx-mgr&gt;/api/v1/trust-management/certificates/&lt;cert-id&gt;?action=validate
</span></span></code></pre></div><h2 id="exchange-of-certificates">Exchange of certificates</h2>
<p>An API request must be executed for each manager node and for the VIP. This requires the certificate ID and the manager node ID. Both can be copied from the web GUI or retrieved via API GET requests.</p>
<p>The following API call is used to exchange the Manager Node certificate:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">POST /api/v1/trust-management/certificates/&lt;cert- id&gt;?action=apply_certificate&amp;service_type=API&amp;node_id=&lt;node- id&gt;
</span></span></code></pre></div><p>The following API call is used to exchange the cluster VIP certificate:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">POST /api/v1/trust-management/certificates/&lt;cert- id&gt;?action=apply_certificate&amp;service_type=MGMT_CLUSTER
</span></span></code></pre></div><p>After replacing the certificates, you should close all browser windows and log in to the NSX Manager again. The certificate should now have been successfully replaced.</p>
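<p>With three manager nodes and a VIP, that makes five API calls in total including the validation, so it can be worth generating the URLs. The Python sketch below only builds the URLs for the calls shown in this post; authentication and actually sending the requests are deliberately left out, and the function names are my own.</p>

```python
from urllib.parse import urlencode

BASE_PATH = "/api/v1/trust-management/certificates"


def _url(nsx_manager, cert_id, params):
    """Join host, certificate ID and query parameters into one URL."""
    return f"https://{nsx_manager}{BASE_PATH}/{cert_id}?{urlencode(params)}"


def validate_url(nsx_manager, cert_id):
    """URL for the GET request that validates an imported certificate."""
    return _url(nsx_manager, cert_id, {"action": "validate"})


def apply_node_url(nsx_manager, cert_id, node_id):
    """URL for the POST request that applies the certificate to one node."""
    params = {"action": "apply_certificate", "service_type": "API", "node_id": node_id}
    return _url(nsx_manager, cert_id, params)


def apply_vip_url(nsx_manager, cert_id):
    """URL for the POST request that applies the certificate to the cluster VIP."""
    params = {"action": "apply_certificate", "service_type": "MGMT_CLUSTER"}
    return _url(nsx_manager, cert_id, params)
```

<p>The resulting URLs can be used with any REST client. As noted above, the paths may differ between NSX versions, so check them against the API reference of your release first.</p>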
<h2 id="further-resources">Further resources:</h2>
<p><a href="https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.2/administration/GUID-50C36862-A29D-48FA-8CE7-697E64E10E37.html">VMware Administration Handbook</a></p>
]]></content>
		</item>
		
	</channel>
</rss>
