What could go wrong?

Mon Jun 29 11:39:26 EDT 2020

On 6/29/20 11:14 AM, Howard F. Cunningham wrote:
>
<snip/>

> */[HC:>] When you say clock jitter I think that you are saying that
> the clock drifts, probably slower, over time.  This tells me that the
> system is busy and missing the clock timings.  When Windows starts, it
> checks the BIOS clock for its time.  While running Windows does not
> check the BIOS clock but rather updates a counter which is used for
> the time.  I wonder if telling Windows to use an NTP server would solve
> this problem./*
>
Not in this case -- the clock itself is fine and doesn't drift; at least
not enough to be seen on screen.

The clock jitter I'm talking about is fine-grained and has to do with
missing samples in the data stream. It appears to be a scheduling problem.

This even occurs if the CPUs allocated in the VM are given "realtime"
priority; and doesn't seem to depend on load because the VM is
significantly over-provisioned.
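
For anyone who wants to see this kind of scheduling jitter on their own
VM, here is a minimal sketch. It is only an illustration, not what wspr
does internally: it asks the OS to wake it on a fixed schedule and
records how late each wakeup actually is. The function names, the 10 ms
interval, and the sample count are all assumptions picked for the
example.

```python
import statistics
import time


def measure_jitter(interval_s=0.01, samples=200):
    """Wake up on a fixed schedule and return how late each wakeup was (seconds)."""
    lateness = []
    next_deadline = time.perf_counter() + interval_s
    for _ in range(samples):
        delay = next_deadline - time.perf_counter()
        if delay > 0:
            time.sleep(delay)  # the scheduler decides when we really resume
        now = time.perf_counter()
        lateness.append(now - next_deadline)
        # Advance by the nominal interval so one late wakeup does not
        # shift every deadline that follows it.
        next_deadline += interval_s
    return lateness


def summarize(lateness):
    """Reduce the raw lateness values to a few numbers, in milliseconds."""
    return {
        "mean_ms": statistics.mean(lateness) * 1000,
        "max_ms": max(lateness) * 1000,
        "stdev_ms": statistics.pstdev(lateness) * 1000,
    }


if __name__ == "__main__":
    print(summarize(measure_jitter()))
```

A large max_ms relative to mean_ms on an otherwise idle guest would point
at the scheduler rather than at load, which is the pattern described
above.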

<snip/>
>
> */[HC:>] I never said that Windows was more stable than Linux./*
>
No, you didn't, and I did not mean to imply that, sorry.

<snip/>
>
> */[HC:>] My clients do not routinely reboot their computers.  What
> usually happens is that the systems are rebooted as part of the
> monthly patch process.  That most likely prevents us from seeing the
> need to reboot random Windows 10./*
>
I suspect that they are not running applications that require
long-running, timing-critical tasks.

> Trying to support long-running timing critical software on Windows is
> problematic and requires special care -- that's all I'm saying. I
> didn't make it up.
>
> */[HC:>]  I never said that Windows never needs rebooting and I did
> not say that there are never problems on some Windows systems.  The
> problem has very much improved today.  What I should have said is
> that Windows should not need to be rebooted daily or even weekly.
> Where we see this problem we are able to track the need to reboot down
> to a specific program that is causing the problem./*
>
That seems reasonable in some cases; but often enough in my experience
that's not a viable approach -- I know there is a give-and-take in play
here, and our situations may not match up.

>      A thought on the unlimited amount of data.  You could try
>     running a scheduled task to delete the data.  Keep in mind that I
>     do not know if this data is needed or what happens if the data is
>     deleted while wspr is running.  If wspr needs that data while
>     running, you could try using a scheduled task to shut down wspr,
>     delete the files, and restart wspr (assuming that wspr does not
>     require any keyboard interaction to start).
>
> I don't think it's required by wspr -- wspr is simply generating the
> data as some kind of log... given the unmitigatable bloat of win* 40G
> of disk space is not enough after a few days and wspr will generate
> enough data to fill up the remaining space; but since I have to reboot
> it in order to avoid the timing problems it's a) not a problem to
> delete the data (there is indeed a menu option for it in the program)
> and b) a scheduled task would not run reliably because of the
> necessary reboot.
>
> */[HC:>] We use a tool that allows us to schedule things like deleting
> files like this.  We have also used it to reset the clock on a couple
> of systems that had clock drift.  That was years ago./*
>
Clock drift is not a problem in this case, but rather clock jitter and
scheduling jitter. As for deleting the files -- I could probably
schedule something; but as I pointed out it's more straightforward to
simply use the wspr software to delete the files (rather than reverse
engineering where they live and being concerned with whether wspr knows
I've deleted them).
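
For completeness, the scheduled-delete idea could look roughly like the
sketch below. It assumes a data directory whose location is known, which
is exactly what isn't known here, so the directory argument is a
placeholder, and the function name and age threshold are made up for
illustration. It defaults to a dry run so nothing is removed until the
file list has been checked.

```python
import time
from pathlib import Path


def delete_old_files(data_dir, max_age_days, dry_run=True):
    """Return files in data_dir older than max_age_days; delete them unless dry_run."""
    cutoff = time.time() - max_age_days * 86400
    selected = []
    for path in Path(data_dir).iterdir():
        # Only plain files directly in data_dir are considered;
        # subdirectories are left alone.
        if path.is_file() and path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            selected.append(path)
    return selected
```

Something like delete_old_files(some_dir, 3) would list what a three-day
cutoff catches; passing dry_run=False actually removes the files. It
does nothing about whether the program notices its files disappearing,
which is the concern raised above.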

> It's not as terrible as it seems -- one should check in on these kinds
> of systems periodically as a matter of course anyway; so handling the
> required reboot and reset as part of that maintenance is easy enough
> to fit in.
>
> */[HC:>] We do monitor all of our client systems to hopefully catch
> problems before the client sees them./*
>
I know you do :-)

I'm not trying to pick a fight -- just reporting real-world experiences
and the mitigations I'm using in response.

There are other ways to slice this too -- as was pointed out: I _could_
run the whole thing on dedicated hardware and might get different
results -- BUT that's costly, complex, and inefficient for other
reasons. I have plenty of hardware already and virtualizing things makes
it easy to manage and maintain. It's important that things work well in
a fluid virtualized environment as that is the direction computing is going.

Everything else I'm doing works in this environment; I shouldn't have to
dedicate hardware to every specific case -- especially when the
performance requirements are not that significant (wspr is not a high
bandwidth operation).

I may be pushing the envelope a bit (I tend to do that) but in this case
not really ... it should "just work" and it doesn't ... So I'm pointing
out where the edges are.

Best,

_M


-- 
kf4hcw
Pete McNeil
lifeatwarp9.com/kf4hcw
