Help with a Windows SMB network error and corrupted file transfers

Hey all! Like the header says, I’m having some bizarre, sporadic, yet quite frustrating and intrusive issues with my newish TrueNAS setup. I’m new here and new to TrueNAS, so apologies if I break or misunderstand conventions here. I can’t find anyone else on the internet with this problem, so I figured I would see if anyone here can help. This issue is pretty weird and I’ve done a bunch of my own testing, so there’s a lot to say; I’ll give as much detail as I can.

Preemptive TL;DR: Random Windows errors on batch file transfers, seems related to periodic speed drops to 0B/s but doesn’t do it every time, error code 0x8007003b, says it can’t transfer file, except the file exists but is corrupted somehow and is a major pain in the neck to delete and/or replace. About a 1-in-2 chance this happens during a batch [50-500] file transfer and then it’s ages of trying to delete the corrupted file and start over. Basic troubleshooting didn’t work. Disabled Windows Firewall for private network and it didn’t help. Laptop connected to same SMB share didn’t help. No red flag apparent in Task Manager or TrueNAS dashboard. Not even sure what to tinker with next.

My Setup

Main Rig:

  • Ryzen 5900X
  • Asus TUF Gaming X570 Plus [Mark 1 version, I think they made an updated version] [No onboard Wi-Fi]
  • Onboard 1Gb/s [See below for routing]
  • 32GB Ram
  • NVIDIA 3060 12GB
  • PCIe Wi-Fi/Bluetooth card [only for Wi-Fi, NOT connected to NAS network]
  • Windows 11 23H2 [24H2 apparently hasn’t rolled around to me yet, “Check for Updates” says “up to date”]

NAS

  • Ryzen 5500GT w/ Radeon Graphics
  • Asus Prime B550 Plus
  • 32GB Ram
  • TrueNAS CORE -13.0-U6.7 [Problem began on 13.0-U6.2, updated manually to U6.7 but problem persists]
  • 1 ZFS Pool set up with SMB share for my editing rig, strictly following the TrueNAS guide. No plugins, VMs, or anything other than the storage pool and SMB configuration.

Network Routing:

Dedicated TP Link gigabit router [ER605 V2], no internet connection for NAS, different link for my Main rig.

NAS Onboard Ethernet ↔ TP Link Router ↔ Main Rig Onboard Ethernet -|- PCIe Wi-Fi/Bluetooth Card ↔ Home Router ↔ Web

Background

I’m a freelance event videographer/video editor, so my job is mostly shooting and editing weddings. I’m reasonably techy [AKA I watch a lot of LTT and I’m the primary tech volunteer for a church that’s almost big enough to hire me to do it], but I’m not “tinker with Linux” kinda tech savvy, and networking that goes beyond basic stuff is probably too much for me.

I’ve been working from home on my main rig forever. It’s built like a gaming machine except that it’s packed full of drives, one SATA SSD for my OS and two NVMe drives for editing, plus several high capacity hard drives managed by windows. I had finally filled up enough hard drives that I was sick of having them in my main rig so I finally built a NAS maybe a month ago and put TrueNAS Core on it.

Now I’m finally getting around to moving about 20+ TB of footage, project files, photos, etc. out of my main machine and into my NAS so I can finally pull those drives out of my windows machine. And that’s when I started running into problems:

The Problem

Periodically, but with no rhyme or reason I can discern, my transfer speeds drop from the gigabit saturated ~100-110 MB/s speeds all the way to 0B/s, and then shoot back up. That’s not too much of an issue for me, except when the dropout is significant enough that it throws up an error code [0x8007003b, which as far as I can tell is a 10 digit way to say “idk check your network or something lol”, courtesy of Microsoft] and asks me what to do [Try again/Skip/Cancel]. That makes it hard to transfer overnight, and it’s causing some other problems too, namely that whatever file it is that is getting transferred when it gets interrupted basically becomes a parasite after that error comes up. Here’s an example:

If I transfer three clips–

Clip A.mp4 - 734 MB
Clip B.mp4 - 241 MB
Clip C.mp4 - 401 MB

–at the same time, and it throws up the error during Clip B, it doesn’t matter which of the three options I choose, Clip B will show up with the exact same file size as the original. So if I hit “Try Again”, it will actually upload Clip C next, and anything else in the list [potentially failing again on some other files along the way], and then at the end it will give me another dialogue box that tells me that Clip B already exists. Replacing is not an option [you’ll see why] so if I have it save a new copy instead, I’m left with:

Clip A.mp4 - 734 MB
Clip B.mp4 - 241 MB
Clip C.mp4 - 401 MB
Clip B(1).mp4 - 241 MB

And the biggest problem is that Clip B is now super corrupted or something. Trying to do anything with it is a nightmare. It won’t open in whatever player you have [I’ve had it happen to both MP4 and BRAW clips, and VLC and Blackmagic Raw Player have each frozen, crashed, or just failed to open them]. Right-clicking the file causes Explorer to lag for a few seconds, and clicking delete causes it to hang for like 2 minutes, after which it will finally give me the dialogue box to confirm deletion. Then it gives me the file transfer graph and proceeds to do nothing [except File Explorer will fail to respond once or twice along the way], and then, sometimes after literally 20+ minutes, it will throw up another error, now saying that the file can’t be deleted because it’s in use by another program.

That’s a big issue because I need my project files to point back to the same filename. I can’t have a broken Clip B reference interfere with a project file if I need to recall a project. I need to rename “Clip B(1).mp4” to exactly “Clip B.mp4” again, and that means I need the corrupted one with the original name to be gone. Besides, I certainly don’t want to store duplicates of files. Some of these are 100GB+ RAW files.
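For what it’s worth, the error code from the dialogue can be picked apart a little. This is a minimal sketch that splits 0x8007003b into the standard HRESULT bit fields; the tiny lookup table is my own and only holds the one Win32 code that matters here, so treat it as an illustration rather than a full decoder.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its standard bit fields."""
    return {
        "failure": bool((value >> 31) & 1),  # bit 31: severity (1 = failure)
        "facility": (value >> 16) & 0x7FF,   # bits 16-26: which subsystem raised it
        "code": value & 0xFFFF,              # bits 0-15: the underlying error code
    }


# Illustrative lookup only: just the values relevant to this thread.
FACILITY_WIN32 = 7
WIN32_ERRORS = {59: "ERROR_UNEXP_NET_ERR: An unexpected network error occurred."}

fields = decode_hresult(0x8007003B)
print(fields)
if fields["facility"] == FACILITY_WIN32:
    print(WIN32_ERRORS.get(fields["code"], "unknown Win32 error"))
```

So the code is a wrapped Win32 error 59, which only says the network connection failed unexpectedly mid-operation. It does not say which end, cable, or device is at fault.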

Troubleshooting so far

  • Task Manager shows no obvious red flags for what could be using the file. I’ve done this without trying to open it, so it’s not BRAW Player or VLC. Could it be some under-the-hood process of Explorer itself? Windows Defender? Heck if I know.

  • Same with the stats in the TrueNAS GUI; Ram usage is high compared to what I’m used to on windows but not full, and most of it is ZFS cache, not system services; CPU is barely doing anything; pool is only at 19% capacity; and the network transfer rate is basically nothing while trying to delete.

  • Restarting Explorer from TM frees it up from being frozen and laggy, but if I try again I get the same results.

  • Temporarily shutting off SMB from ui/sharing/SMB/edit from TrueNAS browser UI doesn’t stop the problem from happening either, even though I assumed that would force Windows to re-initiate with the NAS without a full restart.

  • Restarting my main rig allowed me to delete the file normally in 3/4 cases, no idea what was different that 4th time.

  • Restarting my NAS also allowed the file to be deleted normally in 1/1 cases.

  • Hooking up my Laptop [Asus ROG Zephyrus G15, Ryzen 5800H, onboard networking] to the same router [different cable] and transferring the files didn’t prevent disconnections from happening. When a disconnection happened during the test transfers from my laptop, the TrueNAS GUI Dashboard stats would hang up on my Main Rig. In one out of three cases, it actually booted the GUI back to the login page. I think this might narrow the dropouts down to router and/or NAS configuration and/or onboard Ethernet on the NAS end. The weirdness around these corrupted files after a connection has been reestablished still confuses me though.

  • Turning off Wi-Fi on both Main Rig and Laptop to rule out Windows weirdness about prioritizing one network. Even with only Ethernet connected, dropouts still occurred on both systems.

  • Having a file fail to transfer from one system and then deleting it from the other via SMB worked as intended; it seems deleting/altering is only a problem from the same system that failed to transfer the file.

  • Multiple forum posts that had the same windows error code reported that the fix was to disable Windows Defender’s Firewall. I disabled it for private networks [and verified that my Ethernet connection to my TP-Link router was set to private in Windows] but to no avail, I still suffered dropouts. Again, the fact that the GUI hangs on a separate machine when a file transfer dropout occurs makes me feel like it can’t be the transferring system that’s the issue.

  • I use Surfshark VPN on most of my devices. I wasn’t connected through it, but I eventually uninstalled it from my laptop entirely, and after a restart, still suffered dropouts that led to a corrupted file. Not it.

  • Discovered during testing that files report their full size to File Explorer the moment they begin to transfer, regardless of whether they have finished transferring. If I start to transfer a single 28GB file on my laptop, it immediately looks like a full 28GB file in Explorer, even on the other system. This is unlike what I’m used to in Windows, with .part files and stuff. Is this a ZFS characteristic? It makes it more difficult to identify files that failed to transfer properly, and I don’t like this behavior.
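Since a half-transferred file already shows its full size, comparing sizes can’t tell a good copy from a bad one. A rough sketch of checking by content instead: hash every file on the source side and compare it against the copy on the share. The function names are my own and nothing here is TrueNAS-specific; it only assumes both folders are readable from the machine running it.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in 1 MiB chunks so huge clips never sit in RAM at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_bad_copies(source_dir: Path, dest_dir: Path) -> list[str]:
    """Return relative paths that are missing or differ under dest_dir."""
    bad = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = dest_dir / rel
        # Same name and same reported size is not enough; compare content.
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            bad.append(rel.as_posix())
    return bad

# Example call with made-up paths (swap in the real source folder and share):
# find_bad_copies(Path(r"D:\Footage"), Path(r"\\truenas\editing\Footage"))
```

This reads every file twice, once locally and once over the network, so it will be slow on 100GB+ files. Given that the broken files behave badly on the machine that failed to copy them, it is probably safer to run the check from the other system.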

Conclusion and plea

Basically I’m completely lost. I have no idea what I can even test now. I’ve spent a day and a half on this and gotten nowhere. I don’t normally ask for help on forums because I like to figure things out on my own and benefit from the experience, but I have real work to do and eager newlyweds to please, and I don’t know what else to do beyond what I’ve already done.

It’s not like I have a second NAS I can test, nor can I even test on my home router [it’s a Starlink because I can only get DSL to my house, and it has literally no accessible wired connections and it’s two floors below my other systems]. So are there some wizards out there that can tell me that I just configured something wrong? Do I just need to order some dedicated NICs or something? Any help or ideas are appreciated.

I either need to:

  • Fix the dropouts issue and thereby prevent the corruption issue from ever happening

Or at least

  • Find out how to deal with the corruption issue without having to hook in a different system or restart the Main Rig or NAS.
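A possible workaround for that second option, sketched under my own assumptions: copy each file to a temporary “.part” name, check the hash, and only then rename it to the real filename. That way a failed transfer never occupies the name the project files point to. `safe_copy` and `file_digest` are hypothetical helper names, not part of any existing tool.

```python
import hashlib
import os
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safe_copy(src: Path, dst: Path, attempts: int = 3) -> bool:
    """Copy src to dst via a '.part' name; rename only once the hashes match."""
    part = dst.with_name(dst.name + ".part")
    expected = file_digest(src)
    for _ in range(attempts):
        try:
            shutil.copyfile(src, part)
            if file_digest(part) == expected:
                os.replace(part, dst)  # the final name only ever holds a verified copy
                return True
        except OSError:
            pass  # e.g. the connection dropped mid-copy; try again
    return False
```

This does not fix the dropouts, and a leftover “.part” file may still be as stubborn to delete as the corrupted clips are now. The read-back may also be served from the client’s cache rather than the NAS disks, so it is a sanity check, not proof.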

Does anybody have any ideas?

It sounds like an issue with your NAS’s onboard RealTek NIC.


If you were running SCALE, I would suspect a bug in SMB, since this has affected others before. It’s not out of the realm of possibility that an SMB bug might exist on Core 13.0-U6.7.

It’s a budget board, is that just a cheap NIC? Should I order a better one and wait the week for it to get here? I assume TrueNAS can take a PCIe one right?

Some networking posts to view
NETWORKING

It can take whatever FreeBSD 13.x (Core) or Debian (SCALE) supports.

I really doubt you’ll get better results if you upgrade to TrueNAS Core 13.3-U1 (FreeBSD 13.3), since even FreeBSD 13.0 supports the RealTek 8111-based chipset.

This sounds more like a hardware problem than a “driver isn’t the latest version” problem.

Well I just read up on what SmallBarky supplied and it seems the realtek hate is pretty intense around here. Since that seems like a common culprit I’ll use those resources and hold off on moving my archive for a bit while I wait for one to come in. I’ll be sure to update this thread if that fixes it. Thanks folks, I’m hoping it’s that simple.

More like poor performance and corruption hate. :wink:

Not just RealTek, but also fake “Intel” cards, which has bitten another user before.


No spare NIC somewhere? Not even in another PC?

You can also double-check the connection or try a different ethernet cable on the NAS’s LAN port.
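To make that connection check a bit more concrete, here is a rough sketch of a probe that tries a TCP connection to the NAS’s SMB port (445) once a second and groups the failures into dropout spans. The function names and the default interval are my own choices; it assumes nothing about TrueNAS beyond SMB listening on the usual port.

```python
import socket
import time


def probe(host: str, port: int = 445, timeout: float = 2.0):
    """Return the TCP connect time in seconds, or None if the connection failed."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def dropout_spans(samples):
    """Group consecutive failed probes into (first_timestamp, last_timestamp) spans."""
    spans, start, last = [], None, None
    for stamp, latency in samples:
        if latency is None:
            if start is None:
                start = stamp
            last = stamp
        elif start is not None:
            spans.append((start, last))
            start = None
    if start is not None:
        spans.append((start, last))
    return spans


def monitor(host: str, count: int, interval: float = 1.0):
    """Probe the host `count` times and return the dropout spans seen."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), probe(host)))
        time.sleep(interval)
    return dropout_spans(samples)
```

Running something like `monitor("192.168.0.10", 3600)` (a made-up address) on two machines during a transfer would show whether both lose the NAS at the same moments. That would point at the NAS side or the router rather than either client, though it cannot separate NIC, cable, and router on its own.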

Sorry that you have to go through this. If it is a hardware issue, then the same thing would manifest with any other software or NAS.

:warning: Understand that ZFS does not protect you from this type of corruption.

Only a crappy USB chinesium one. Syncwire, weighs nothing. It’s got blue in the USB, although I’ve been lied to about USB 3 by stuff like this before. No model number so no way to know what’s in it. It definitely won’t be my permanent solution but I might as well give it a whirl tonight to see if we can confirm that realtek chip on the motherboard is the culprit.

Yeah no dice on the USB NIC, couldn’t even get it working within a few minutes, I think it’s almost certainly more trouble than it’s worth. I just nabbed a cheap ebay listing, $14 for a supermicro card with an Intel i350 and two gigabit rj45s. Hopefully exactly what I need. Shopping for a gigabit intel non-knockoff card kinda sucks right now.

Anyways, consider this topic on hold until I can get that card into my NAS. I will update as soon as I can get it in, working, and tested, and hopefully that’s the end of this saga. Thanks again for your help folks.

So you have corrupted files when copying from TrueNAS to your Windows computer. How were you able to copy files from your Windows computer to TrueNAS? The same way and it worked fine?

As for copying files in Windows: yes, it shows the full size right away and copies without a .part extension. The .part extension is what you get when a browser downloads a file into your Downloads folder.

I fail to understand the setup here : NAS Onboard Ethernet ↔ TP Link Router ↔ Main Rig Onboard Ethernet -|- PCIe Wi-Fi/Bluetooth Card ↔ Home Router ↔ Web

I don’t understand the post, have you tried both wireless and also wired? I see you have tried on two different computers so we can rule out a Windows bug or your computers.

I doubt it’s the Realtek NIC card because it’s still functional and you were able to copy data to TrueNAS.

Your test is with files from 400 to 800 MB, what about smaller files?

Hey man, I appreciate you lending your help, but I’m not really sure how to answer because all of the information you just asked for is in the original post.

  1. Copying from Windows to TrueNAS is the problem.
  2. Yes, that’s my setup. My main rig is connected to two networks but I ruled out connection to the home router as the problem by isolating the local network which connects it to my NAS.
  3. I’m not hooking up a wireless router just to test this issue, nor am I going to transmit files that are hundreds of gigabytes over the air on the regular.
  4. It’s not unreasonable that it’s the network card given all the information provided by the other people who have supplied resources in this thread. One of the articles I read on the matter explained how on-board NICs are not designed to be saturated for long periods of time like a server setting demands, and I don’t find that explanation unsatisfactory. This is the current prevailing theory. I have the NIC for this now, and the process is being held up by the fact that one of my drives just threw up a bunch of SMART errors at less than 1000 hours. My RAIDZ configuration means this isn’t really an issue but I’ve paused using my NAS until my replacement arrives, as I didn’t have a spare kicking around.
  5. My example was with 400-800MB files; as stated elsewhere in my post, some of these files are humongous 100GB+ RAW files from my cinema cameras. I haven’t collected any meaningful data on whether or not the corruption is more/less likely on bigger/smaller files.

I know you don’t want to test with wireless but it would be a good test, the results might be quite interesting :slight_smile:

The SMART errors are interesting too. It depends on what type of errors they are, but it’s possible the drive is so corrupted that it causes the issue you have. It will be interesting to see what happens with a new, fully functional drive.

I wouldn’t worry about copying multiple 100 GB+ files; it won’t saturate. I do it regularly on a lot of different computers with different NICs and have for many years, and I’ve never had an issue.