Files sometimes get stuck while copying

Hey,

I’m running into a strange issue:
I have a TrueNAS system with a two-HDD storage pool, and normally I can copy or move files from my client (Tuxedo Linux on the same network) to the storage over SMB without any problems.

With some specific files, however, the behavior is odd:
The transfer stops somewhere between ~60–80%, and after a while I get the error “can’t write file XY”. A .part file is left behind, but I can’t delete it because it says “device is busy”.
This happens whether I copy just a single file or multiple files in parallel. All other files transfer perfectly fine.
Mostly I copy backups of our self-hosted cloud: pictures, Immich backups (tar archives), some family videos.

Using scp works for these files, but that’s less convenient.
There’s nothing in the logs related to this error. HDD and system usage is very low, and there’s plenty of free space available.

Dataset information:
Type: FILESYSTEM
Sync: STANDARD
Compression Level: Inherit (LZ4)
Enable Atime: OFF
ZFS Deduplication: OFF
Case Sensitivity: OFF

Do you have any idea how I could fix this properly?

Thanks

Nobody? :confused:

Not really.

Since SCP seems to work, that somewhat rules out a network problem. Perhaps the mount parameters on the Tuxedo Linux need tweaking to get reliable SMB transfers. However, I personally don’t have enough SMB experience to know what might help, or whether any changes would help at all.
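As for the leftover .part file that reports “device is busy”, it might help to check whether a local process still holds it open. A sketch (the path here is hypothetical; `fuser` comes from the psmisc package):

```shell
# Show any local process holding the file open; if none is found,
# the lock is likely held elsewhere (e.g. by the userspace SMB
# client such as GVFS/KIO, or server-side).
fuser -v /path/to/stuck-file.part 2>/dev/null || echo "no local process holds the file"
```

If nothing shows up locally, the handle may be open on the TrueNAS side, in which case restarting the SMB session (unmount/remount, or closing the session on the server) usually releases it.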

1 Like

Without knowing a lot more about the two systems and the file sizes: in general, freezing during file copies on a Linux system can be caused by how the kernel handles write caching. With large files, or when writing to slower devices like USB drives, a large amount of unwritten data can accumulate in the page cache, and the system may pause the copy or become unresponsive while it flushes that data to disk. Other factors can contribute or cause problems too, such as an SMR drive in the system.
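On the client, the writeback thresholds that govern this behavior can be inspected read-only like so (these are standard Linux sysctls; the exact defaults vary by distro):

```shell
# Percentage of RAM that may fill with dirty (unwritten) pages before
# the writing process is throttled, and the lower threshold at which
# background flushing starts. Large RAM + large ratio means gigabytes
# can pile up in cache before anything hits the disk.
cat /proc/sys/vm/dirty_ratio
cat /proc/sys/vm/dirty_background_ratio
```

Lowering these (or using the byte-based `vm.dirty_bytes` variants) is sometimes suggested to smooth out large copies, but that is a workaround, not a fix for the underlying stall.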

1 Like

Which application is creating .part files during a transfer?

How are you mounting the SMB shares? Through fstab / systemd or through the default file manager’s “Network” feature? You want to use the proper method with the cifs module, not GVFS or KIO.

I mounted them through the file manager Dolphin via “network search” and added them to my locations.

How can I see which module is in use?

Thanks

Don’t do that. Use the standard method of mounting an SMB share via the cifs module. This can be done with fstab.

Add an entry for the share in your /etc/fstab, and include the _netdev and x-systemd.automount parameters.
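To answer the question of which mechanism is currently in use: kernel cifs mounts appear in the mount table, while Dolphin’s smb:// locations go through KIO in userspace and never show up there. A quick check:

```shell
# Lists any kernel cifs mounts; GVFS/KIO smb:// locations are
# userspace-only and will not appear in /proc/mounts.
grep cifs /proc/mounts || echo "no kernel cifs mounts"
```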

1 Like

Here is what an /etc/fstab entry looks like:

//192.168.0.101/myshare     /mnt/truenas/myshare     cifs     iocharset=_netdev,x-systemd.automount,noauto,utf8,rw,actimeo=60,cache=loose,uid=winnie,gid=winnie,forceuid,forcegid,file_mode=0660,dir_mode=0770,credentials=/home/winnie/.smbcreds     0     0

The file /home/winnie/.smbcreds should contain the username and password for the TrueNAS user.

username=winnienas
password=mypassword1234

Protect it with chmod 400.
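For example (demonstrated on a throwaway temp file here; the real target would be the credentials file above):

```shell
# Create a stand-in credentials file and restrict it to owner read-only.
f=$(mktemp)
printf 'username=winnienas\npassword=mypassword1234\n' > "$f"
chmod 400 "$f"
stat -c '%a' "$f"   # prints 400
rm -f "$f"          # clean up the stand-in
```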

You can add the location /mnt/truenas/myshare to your Favorites, so you can conveniently access the SMB share.

Okay now I am more interested.
@backpulver how big is the file (and what type) that hangs? Are you copying from Linux to the server or from the server? I use Fedora and Dolphin, and I believe I connect the same way you do, and I have not run into this issue. I may be able to find a file of about the same size and type and give it a try.

1 Like

That’s a long line. I tried it and got a parse error.

The files are between 3–10 GB (backup files, zipped files, some ISOs). They stop every time at exactly between 440 and 500 MB. With scp, the message “stalled” sometimes appears in the CLI, with the same behavior.

You have to make your own fstab entry. I provided that as a sample. I fixed a mistake in the options.

//192.168.0.101/myshare     /mnt/truenas/myshare     cifs     _netdev,x-systemd.automount,noauto,rw,iocharset=utf8,actimeo=60,cache=loose,uid=winnie,gid=winnie,forceuid,forcegid,file_mode=0660,dir_mode=0770,credentials=/home/winnie/.smbcreds     0     0

A memtest wouldn’t hurt just to rule it out, since you seem to see this more commonly with large files.

EDIT: You still should avoid GVFS/KIO and instead use the cifs module to access SMB shares.

1 Like

I copied a folder with 10 ISO files, 13 GB total, back and forth between the laptop and the server using Dolphin set up (right or wrong) the way I have been using it, without issue. My servers are set up under Network in Dolphin with a path like smb://myserver.local/ and show a regular name. There was a setup screen in Dolphin that helped me; I think it appeared when I entered smb:// or selected that as what I wanted to connect to.

The fstab file on my system has a note in it saying that accessible filesystems are maintained elsewhere, and giving a specific command to run to update the system if fstab is changed.
Meddling with something that works okay is further than I am willing to go on a working system. I don’t think anything will break if the entry is added correctly to fstab and the update command is run, but this is my only main system.

/etc/fstab
# Created by anaconda on Sun Dec 29 14:25:04 2024
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.

I don’t know if any of this helped, and if not, sorry I couldn’t help.

Hey,

I did. Here is my entry, adapted from yours:

//10.0.0.17/Backup /home/backpulver/remote/s-1/Backup cifs _netdev,x-systemd.automount,noauto,rw,iocharset=utf8,actimeo=60,cache=loose,uid=backpulver,gid=backpulver,forceuid,forcegid,file_mode=0660,dir_mode=0770,credentials=/home/backpulver/.smbcredentials 0 0

This is the error when trying to mount:

mount: /home/backpulver/remote/s-1/Backup: failed to parse mount options 'rw,_netdev,x-systemd.automount,noauto,iocharset=utf8,actimeo=60,cache=loose,uid=winnie,gid=winnie,forceuid,forcegid,file_mode=0660,dir_mode=0770,credentials=/home/backpulver/.smbcredentials': Das Argument ist ungültig. [Invalid argument]

I’m also very confused, because ~1 month ago everything ran fine. This problem is completely new.

So after some changes I was able to mount the share to my folder.

mount -l:

//10.0.0.17/Backup on /home/backpulver/remote/s-1/Backup type cifs (rw,relatime,vers=3.0,cache=strict,upcall_target=app,username=backpulver,uid=1000,forceuid,gid=1001,forcegid,addr=10.0.0.17,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,reparse=nfs,nativesocket,symlink=native,rsize=4194304,wsize=4194304,bsize=1048576,retrans=1,echo_interval=60,actimeo=1,closetimeo=1)

The copy jobs now run without getting stuck, but very strangely: after every file there is a big pause, also visible in the network monitor. No smooth copy behavior like before this problem started.
xhttps://ibb.co/8LDrMKZT
(I’m not allowed to upload images or post links. Here is the screenshot -.-)

Any idea what this can be?

Thanks

Try to make sure that your ZFS pool is healthy. Check that you have enough free space (ZFS performance degrades once a pool is more than ~80% full). Check the pool status (zpool status).

To make sure it isn’t your storage pool causing the performance issue, do some benchmarks directly on the storage device. That way you can exclude network/SMB influence from the test results.

Try a simple sequential write test. For example, this will write a 16 GB file sequentially (similar to how an SMB file copy would):

$ fio --name=seqwrite --filename=/mnt/pool/dataset/testfile --size=16G --bs=1M --rw=write --ioengine=sync --direct=1 --numjobs=1

Change the /mnt/pool/dataset/testfile to point to a non-existing file within the dataset you wish to test. Don’t point it to an existing file.

To get more insight, watch the zfs io statistics while doing the fio test. For example run zpool iostat -wy 5 to get information on latency.

Sorry, I can’t write today…

Zpool:

  pool: Backup
 state: ONLINE
  scan: scrub repaired 0B in 14:17:01 with 0 errors on Sun Sep 14 14:17:07 2025
config:

        NAME                                    STATE     READ WRITE CKSUM
        Backup                                  ONLINE       0     0     0
          23f54caa-d0df-4c3f-8b4d-8e66ccf16358  ONLINE       0     0     0
          6fe55b3b-ef35-4f1b-bfdf-14659c19aeae  ONLINE       0     0     0

errors: No known data errors

Space:

Multimedia/Backup                                        22T   17T  4.8T  78% /mnt/Multimedia/Backup

Test:

root@s-1[~]# fio --name=seqwrite --filename=/mnt/Multimedia/Backup/test --size=16G --bs=1M --rw=write --ioengine=sync --direct=1 --numjobs=1 
seqwrite: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=sync, iodepth=1
fio-3.33
Starting 1 process
seqwrite: Laying out IO file (1 file / 16384MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=44.0MiB/s][w=44 IOPS][eta 00m:00s]
seqwrite: (groupid=0, jobs=1): err= 0: pid=322334: Fri Sep 26 09:07:55 2025
  write: IOPS=40, BW=40.0MiB/s (42.0MB/s)(16.0GiB/409228msec); 0 zone resets
    clat (msec): min=7, max=126, avg=24.93, stdev=11.63
     lat (msec): min=7, max=126, avg=24.97, stdev=11.63
    clat percentiles (usec):
     |  1.00th=[ 9765],  5.00th=[12125], 10.00th=[13829], 20.00th=[16057],
     | 30.00th=[17957], 40.00th=[19530], 50.00th=[21365], 60.00th=[23987],
     | 70.00th=[27395], 80.00th=[32113], 90.00th=[41681], 95.00th=[50594],
     | 99.00th=[61080], 99.50th=[64750], 99.90th=[77071], 99.95th=[82314],
     | 99.99th=[96994]
   bw (  KiB/s): min=26624, max=55296, per=100.00%, avg=41014.98, stdev=4858.44, samples=818
   iops        : min=   26, max=   54, avg=40.04, stdev= 4.74, samples=818
  lat (msec)   : 10=1.27%, 20=41.33%, 50=52.24%, 100=5.15%, 250=0.01%
  cpu          : usr=0.25%, sys=1.51%, ctx=20293, majf=5, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,16384,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=40.0MiB/s (42.0MB/s), 40.0MiB/s-40.0MiB/s (42.0MB/s-42.0MB/s), io=16.0GiB (17.2GB), run=409228-409228msec

~40 MiB/s is very slow, I see. I will check why it is so slow. All the other storage pools are much faster.

As a test, I also tried the copy jobs on another pool with more drives:

  pool: Home
 state: ONLINE
  scan: scrub repaired 0B in 2 days 00:40:39 with 0 errors on Tue Sep 16 16:40:53 2025
remove: Removal of vdev 1 copied 2.60T in 7h4m, completed on Fri Sep 19 18:12:50 2025
        14.5M memory used for removed device mappings
config:

        NAME                                    STATE     READ WRITE CKSUM
        Home                                    ONLINE       0     0     0
          d13f44e2-799b-43c2-a0c1-c258c2e68765  ONLINE       0     0     0
          0268e2d4-5ef1-45d3-bbb0-c027b9b28b6d  ONLINE       0     0     0
        special
          58142789-93e8-4481-bdc0-1eb1a19c8afc  ONLINE       0     0     0

errors: No known data errors


Home/Backpulver                                       11T  7.8T  3.3T  71% /mnt/Home/Backpulver
root@s-1[~]# fio --name=seqwrite --filename=/mnt/Home/Backpulver/test --size=16G --bs=1M --rw=write --ioengine=sync --direct=1 --numjobs=1
seqwrite: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=sync, iodepth=1
fio-3.33
Starting 1 process
seqwrite: Laying out IO file (1 file / 16384MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=187MiB/s][w=187 IOPS][eta 00m:00s]
seqwrite: (groupid=0, jobs=1): err= 0: pid=325491: Fri Sep 26 09:09:59 2025
  write: IOPS=189, BW=190MiB/s (199MB/s)(16.0GiB/86457msec); 0 zone resets
    clat (usec): min=97, max=55286, avg=5246.42, stdev=2565.82
     lat (usec): min=107, max=55320, avg=5272.74, stdev=2567.42
    clat percentiles (usec):
     |  1.00th=[  133],  5.00th=[  322], 10.00th=[ 3261], 20.00th=[ 4015],
     | 30.00th=[ 4359], 40.00th=[ 4752], 50.00th=[ 5080], 60.00th=[ 5473],
     | 70.00th=[ 5866], 80.00th=[ 6521], 90.00th=[ 7635], 95.00th=[ 8979],
     | 99.00th=[14615], 99.50th=[16450], 99.90th=[22938], 99.95th=[25560],
     | 99.99th=[53216]
   bw (  KiB/s): min=100352, max=1114112, per=100.00%, avg=194255.69, stdev=117703.89, samples=172
   iops        : min=   98, max= 1088, avg=189.68, stdev=114.95, samples=172
  lat (usec)   : 100=0.02%, 250=4.58%, 500=1.28%, 750=0.70%, 1000=0.52%
  lat (msec)   : 2=1.70%, 4=11.11%, 10=76.97%, 20=2.99%, 50=0.13%
  lat (msec)   : 100=0.01%
  cpu          : usr=0.73%, sys=5.15%, ctx=17461, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,16384,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=190MiB/s (199MB/s), 190MiB/s-190MiB/s (199MB/s-199MB/s), io=16.0GiB (17.2GB), run=86457-86457msec
root@s-1[~]# 

So I have to check why the disk performance is so slow.

Could this stuck behavior be caused by the slow performance?

Thanks

I ran too many tests and posted the wrong outputs earlier…

The right ones are in now.