Three checks, one morning answer
Rodney framed the problem as three layers that have to hold for a report to be trustworthy. Today each is checked manually, after the fact. The goal is a single proactive status - "all groovy" or "look at this" - on Trevor's phone before he opens the door. We would build from the bottom of the stack outwards, starting where the data is already flowing.
Did the integration jobs run?
Overnight SQL scrapes machine data into the warehouse
Each night, integration jobs pull shift data off the SCADA / IFIX machines into the data warehouse. If one locks up, the day's reporting is built on incomplete data.
Did the reports send?
SSRS generates and emails them via the on-site SMTP relay
SSRS 2018 builds and schedules the reports, then emails them to managers and the DBA admin mailbox. If the SMTP relay is blacklisted or a subscription fails, people quietly never get their report.
Was the data complete?
An end-of-shift has to be recorded for a report to be whole
Occasionally a mill machine misses its end-of-shift mark, so a report looks fine on the surface but is half-populated.
The morning all-clear is the outcome that ties all three together.
We deliver by pain and impact, not strictly top-to-bottom - the quickest, highest-value win (the status email and the deliverability fix) comes first, then the deeper integration and end-of-shift checks. As of the 17 June call, the next horizon is in view too: scoped or sandboxed access to the data so we can build a proper interface over it, not just read the emails.
Priorities, in order
Ranked by impact against effort, not by the layer diagram. Matt framed it as an elevator - get to the first floor (a useful status email) fast, then climb as more data access opens up. Nothing locked - we rank these together.
The morning all-clear
One status email before Trevor walks in
The pain
- Trevor spends ~30 minutes every morning checking by hand whether all the reports came out and the SQL jobs ran.
- Even after checking, a manager often emails hours later: "I didn't get my report."
- When Trevor is away, Rodney inherits the entire manual routine.
What we'd build
- A single daily status email - "all groovy" or "look at these" - waiting before the shift starts.
- Driven by the mailhook (report receipts) plus Trevor's SSRS agent-log query: subscriptions sent, agent finished on time.
- No on-prem access needed for this first floor - it runs off emailed signals.
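A minimal sketch of how that first-floor status could be assembled from the two emailed signals. The report names, the `agent_finished_on_time` flag and the function name are hypothetical; the real inputs would come from the mailhook receipts and Trevor's agent-log query.

```python
from dataclasses import dataclass, field


@dataclass
class MorningStatus:
    """Outcome of the morning check: a headline plus anything to look at."""
    headline: str
    issues: list = field(default_factory=list)


def build_morning_status(expected_reports, received_reports, agent_finished_on_time):
    """Compare expected reports against mailhook receipts and the agent log.

    expected_reports / received_reports are collections of report names
    (made-up names below); agent_finished_on_time is a bool assumed to be
    derived from the SSRS agent-log query.
    """
    issues = []
    # Any expected report with no receipt in the mailhook is an issue.
    for name in sorted(set(expected_reports) - set(received_reports)):
        issues.append(f"No receipt for report: {name}")
    if not agent_finished_on_time:
        issues.append("SSRS agent did not finish on time")
    headline = "all groovy" if not issues else "look at these"
    return MorningStatus(headline=headline, issues=issues)


status = build_morning_status(
    expected_reports=["Downtime", "Moisture", "Throughput"],
    received_reports=["Downtime", "Throughput"],
    agent_finished_on_time=True,
)
print(status.headline)
for issue in status.issues:
    print(" -", issue)
```

Keeping the result as a headline plus a list of issues means the same structure can feed the email body now and a richer interface later.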
Fix report deliverability (SPF)
Stop the SMTP relay being blacklisted
The pain
- The data warehouse / Timbersmart reports relay through ITCO's on-site SMTP server, which has previously failed and been blacklisted.
- Every one of this week's test reports fails an SPF check - they only land because DKIM passes. That is a fragile position.
What we'd build
- Add the relay's sending IP (180.235.104.212) to the redstagtimber.co.nz SPF record - a small DNS change.
- Longer term, consider routing report mail through an authenticated send so deliverability stops depending on DKIM alone.
- Raised on the 17 June call - Matt to take it to ITCO. It affects every report off that mail server, so worth doing before more reports route through the mailhook.
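To make the DNS change concrete, here is a rough sketch of checking whether an SPF record covers the relay IP. The current redstagtimber.co.nz record is not reproduced in this document, so both records below are illustrative stand-ins, and the helper only understands literal `ip4:` / `ip6:` mechanisms.

```python
import ipaddress


def ip_covered_by_spf(spf_record, ip):
    """Return True if `ip` matches a literal ip4:/ip6: mechanism in the record.

    include:, a:, mx: and redirect= need DNS lookups and are ignored here,
    so a False result only means "not listed literally".
    """
    address = ipaddress.ip_address(ip)
    for term in spf_record.split():
        term = term.lstrip("+")  # '+' is the default (pass) qualifier
        prefix, _, value = term.partition(":")
        if prefix.lower() not in ("ip4", "ip6") or not value:
            continue
        # strict=False accepts both a single address and a CIDR range.
        if address in ipaddress.ip_network(value, strict=False):
            return True
    return False


RELAY_IP = "180.235.104.212"

# Illustrative records only - NOT the real published SPF record.
before = "v=spf1 include:spf.protection.outlook.com -all"
after = "v=spf1 include:spf.protection.outlook.com ip4:180.235.104.212 -all"

print(ip_covered_by_spf(before, RELAY_IP))
print(ip_covered_by_spf(after, RELAY_IP))
```

The actual edit would be made by whoever manages the domain's DNS; the sketch only shows the shape of the change, a single `ip4:` term added ahead of the final `all` mechanism.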
The logic tree: due vs not-due
Tell a broken report from one that legitimately didn't run
The pain
- The EWP report only runs when wood is filleted through the kilns - so when that process is not done, the report legitimately never generates. A manager chased it as "missing" after a month.
- It does not show as failed on the day-summary; Rodney had to run a query against the database to see it had not run.
- Separately, four SQL agents were stacked into one sequential job - the first step ran so long it missed the 3am window, silently skipping the next three steps (including shift patterns). Rodney has since split them.
What we'd build
- A documented map of which reports are conditional and on what trigger (the EWP / kiln case is the first), so the monitor can say "not due" instead of crying wolf.
- Surface the database failure flags Trevor can already query, on a schedule, into the mailhook as a daily error feed.
- Feeds both the status email and any later AI - it needs to know what to expect.
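The due / not-due decision could look something like the sketch below. The trigger name `kiln_fillet_run` and the shape of the map are assumptions for illustration; the real map would come out of the documentation exercise described above.

```python
from enum import Enum


class ReportState(Enum):
    SENT = "sent"
    NOT_DUE = "not due"
    MISSING = "missing"


# Hypothetical map: report name -> the trigger it depends on.
# Reports absent from the map are treated as due every day.
CONDITIONAL_REPORTS = {"EWP": "kiln_fillet_run"}


def classify_report(name, received, triggers_fired):
    """Decide whether a report was sent, legitimately not due, or missing."""
    if received:
        return ReportState.SENT
    trigger = CONDITIONAL_REPORTS.get(name)
    if trigger is not None and trigger not in triggers_fired:
        # Conditional report whose trigger never happened: no alarm.
        return ReportState.NOT_DUE
    return ReportState.MISSING


# No kiln fillet run today, so a missing EWP report is expected.
print(classify_report("EWP", received=False, triggers_fired=set()))
# The trigger fired but nothing arrived: that one is worth chasing.
print(classify_report("EWP", received=False, triggers_fired={"kiln_fillet_run"}))
```

Treating "not due" as its own state, rather than folding it into "sent", keeps the status email honest about what actually ran.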
End-of-shift gap detection
Catch the half-populated reports before managers do
The pain
- If a busy SCADA box misses recording its end-of-shift, the night-shift data never rolls over - the report runs but is half empty.
- It is the hardest class of fault: everything looks fine on the surface.
What we'd build
- Detect the missing end-of-shift signal and flag it on the morning status, with the reason.
- Later, support the remediation Rodney described - scrape the value off the machine and re-run.
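As a sketch of the detection step only: if each end-of-shift mark can be read as a (machine, date) pair, finding the gaps is a set comparison. Where those marks come from (a log, a warehouse row) is still an open assumption, and the machine names below are made up.

```python
from datetime import date


def find_end_of_shift_gaps(machines, end_of_shift_marks, shift_date):
    """Return one flag line per machine with no end-of-shift mark for shift_date.

    end_of_shift_marks is an iterable of (machine, date) pairs; the source
    of those pairs is an assumption, not something confirmed on site.
    """
    marked = {machine for machine, day in end_of_shift_marks if day == shift_date}
    return [
        f"{machine}: no end-of-shift recorded for {shift_date.isoformat()}, "
        "report may be half-populated"
        for machine in machines
        if machine not in marked
    ]


shift = date(2024, 1, 15)  # made-up date for the example
marks = [("Kiln 1", shift), ("Planer", shift)]
for flag in find_end_of_shift_gaps(["Kiln 1", "Planer", "Grader"], marks, shift):
    print(flag)
```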
On the radar: local-AI Q&A
Ask the mill data questions - hosted on-prem, nothing to the cloud
The idea
- Let staff ask plain-English questions of the data (e.g. "how many hi-vis jackets did we buy last month?") via an MCP server over SQL views.
- Stores / Workmate data is cloud-comfortable; Timbersmart is financial and must stay on-prem.
Where it sits
- A separate workstream - Rodney is already testing locally with Ollama and Qwen.
- Incredible can bring local-model experience and help present a board-ready proof of concept.
Pain against ease of build
Top-right first: high pain, quick to build. The morning all-clear and the SPF fix are the early wins we can ship from data we already have. End-of-shift detection is high-value but waits on deeper access; the local-AI proof of concept is a separate, longer play.
The team, and how the morning runs
Today the whole morning routine runs through one person. The monitor changes the shape of that - it becomes the first stop, fed by the SQL logs and report receipts, and Trevor becomes the fixer rather than the checker.
From end-of-shift to the managers' inboxes - and where the monitor plugs in
Shift runs
Mill / SCADA
IFIX machines record PLC data - throughput, moisture, grade - per area of the mill.
End of shift
SCADA box
The machine closes off the shift and rolls over. The occasional miss point.
Integration jobs
Data warehouse
Overnight jobs scrape each machine into the warehouse tables.
Reports generate
SSRS 2018
Scheduled reports build off the integrated data, early morning.
SMTP send
Relay + mailhook
Reports email to managers and the DBA mailbox - CC'd to our mailhook. Where blacklisting bites.
Managers' inboxes
Quality & site managers
The quality team open the day with their reports already waiting.
Morning all-clear
The monitor
One status email: everything ran, or here is exactly what to look at.
The people
Trevor Pratt
Database administrator, Red Stag
Owns the morning check today - 30-odd reports and the SQL jobs, by hand. The monitor's first job is to lift that routine off him so he is the one who fixes, not the one who hunts.
Rodney Mills
IT & systems, Red Stag
Our main contact and knowledge-holder - sets up the test report, knows the systems and the triggers, and is running the local-AI experiments.
Brenda Fort
Production (ENT) server, Red Stag
Looks after the production software server. Relevant because it also runs report services - a second place report status lives.
Site & quality managers
Report consumers
Open each shift with their daily reports - downtime, moisture, throughput. They are who feels it when a report is late or wrong.
Paul, Tim & the board
Sign-off & sponsorship
Paul approves data leaving site; Tim is the manager above that. At board level, Marty Verry is the push to "move with the times" - useful air cover for the AI work.
Incredible
Matt & Mitch - build; Aiyana - delivery
Build the monitor and the proof of concept; Aiyana runs scheduling and capacity.
The systems, and how they fit
Everything is on-prem and Red Stag intends to keep it there. The plan does not move mill data to the cloud - it works from signals coming out: report emails, agent logs, exception flags.
The integration prize: a read-only signal out
The whole approach turns on getting a signal out without the data leaving site - report receipts to a mailhook (already proven), SSRS agent logs by email, and later exception flags. Enough to be proactive, nothing sensitive in motion.
Timbersmart (SQL Server)
Tracks every packet of wood through the mill - the financial heart. Dollars, salaries, the sensitive one.
Stays on-prem; never to the cloud
Data warehouse / reporting server
Holds the machine integrations (kilns, sonic testers, planers, graders) and runs Workmate plus SSRS. Rodney owns this one.
Source of the integration-job status
ENT production server
The production software server, looked after by Brenda Fort - but it also runs report services, so Rodney still watches it for flags.
Second source of report status
SSRS 2018 (Native)
Builds, schedules and emails the reports to managers and DBA admin. Version and mode confirmed by Rodney.
Agent logs tell us what was sent
On-site SMTP relay (ITCO)
A simple SMTP server relays the reports out. The component that has been blacklisted before.
Fix SPF; consider authenticated send
Glide mailhook
Every report is CC'd to a Glide mail address, giving us a timestamped record of what was delivered.
The delivery oracle - live now
SCADA / IFIX Historian
Windows 11 machines across the mill recording PLC tags into a local database; the source of end-of-shift.
Where end-of-shift gaps originate
Microsoft 365 / SharePoint
Email and documents already live in the cloud - useful context when weighing what is genuinely "on-prem only".
Already off-site, by the way
Local model (Ollama + Qwen)
Rodney's on-prem experiment for plain-English Q&A over mill data, so nothing sensitive goes to a hosted model.
Workstream 2 - prove it, then scale
Scoped sandbox database
Rather than open the whole database, Red Stag could stand up a sandbox with just the tables or views we need - or anonymised sample data - so sign-off is easy and nothing sensitive leaves.
The likely path to a real interface
Week one of live data
The daily test report Rodney set up is already flowing into our mailhook. Eight days in, it is telling us two useful things - the job is rock-steady, and the deliverability problem is real and measurable.
The root cause, found in the headers
Every report leaves RSTSQL01, relays through RSTRDSM2, and goes out via an IP (180.235.104.212) that is not in the redstagtimber.co.nz SPF record - so SPF fails. They survive only because DKIM passes (so DMARC passes). That is almost certainly the historic "blacklisted" behaviour. Adding the relay IP to SPF is a small DNS change that removes the risk at the root.
| Received (mailhook) | Subject | From | SPF | DKIM / DMARC |
|---|---|---|---|---|
| 10 Jun, 16:33 | SMTP Test Report | reports@redstagtimber.co.nz | Fail | pass / pass |
| 11 Jun, 11:15 | SMTP Test Report | reports@redstagtimber.co.nz | Fail | pass / pass |
| 12 Jun, 11:15 | SMTP Test Report | reports@redstagtimber.co.nz | Fail | pass / pass |
| 13 Jun, 11:15 | SMTP Test Report | reports@redstagtimber.co.nz | Fail | pass / pass |
| 14 Jun, 11:15 | SMTP Test Report | reports@redstagtimber.co.nz | Fail | pass / pass |
| 15 Jun, 11:15 | SMTP Test Report | reports@redstagtimber.co.nz | Fail | pass / pass |
| 16 Jun, 11:15 | SMTP Test Report | reports@redstagtimber.co.nz | Fail | pass / pass |
| 17 Jun, 11:15 | SMTP Test Report | reports@redstagtimber.co.nz | Fail | pass / pass |
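The SPF / DKIM / DMARC columns above can be pulled out of each captured message automatically. The sketch below assumes the mailhook keeps the receiving server's `Authentication-Results` header; the sample message is hand-written to mirror the table, not a real capture, and real headers vary in layout between receivers.

```python
import re
from email import message_from_string

# Hand-written sample that mirrors the table; not a captured message.
SAMPLE = (
    "Authentication-Results: mx.example.net;\n"
    " spf=fail smtp.mailfrom=reports@redstagtimber.co.nz;\n"
    " dkim=pass header.d=redstagtimber.co.nz;\n"
    " dmarc=pass header.from=redstagtimber.co.nz\n"
    "Subject: SMTP Test Report\n"
    "\n"
    "body\n"
)


def auth_results(raw_message):
    """Extract spf / dkim / dmarc verdicts from the first Authentication-Results header."""
    msg = message_from_string(raw_message)
    header = msg.get("Authentication-Results", "")
    found = dict(re.findall(r"\b(spf|dkim|dmarc)=(\w+)", header.lower()))
    # Report "none" for any method the header does not mention.
    return {method: found.get(method, "none") for method in ("spf", "dkim", "dmarc")}


print(auth_results(SAMPLE))
```

Logging these three verdicts per receipt would show the effect of the SPF fix as soon as it lands: the SPF column should flip from fail to pass while the other two stay unchanged.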
The consistency is the point: a job this regular means "nothing arrived by 06:45" is a reliable alarm. The mailhook loop - capture, timestamp, verify - is proven. From 17 June it stops being just the test report: Rodney is forwarding the real integration-status, day-summary and error reports into the same hook, so the morning all-clear gets its actual content.
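The "nothing arrived by 06:45" alarm reduces to a timestamp comparison over the mailhook receipts. A minimal sketch, assuming receipts are available as naive local-time datetimes (time zones are ignored here) and using made-up timestamps:

```python
from datetime import date, datetime, time


def arrived_by_cutoff(receipt_times, day, cutoff=time(6, 45)):
    """True if any receipt landed on `day` at or before the cutoff time."""
    start = datetime.combine(day, time.min)
    deadline = datetime.combine(day, cutoff)
    return any(start <= received <= deadline for received in receipt_times)


day = date(2024, 1, 15)  # made-up date for the example
on_time = [datetime(2024, 1, 15, 6, 10)]
late = [datetime(2024, 1, 15, 7, 5)]

print(arrived_by_cutoff(on_time, day))
print(arrived_by_cutoff(late, day))
print(arrived_by_cutoff([], day))
```

The cutoff is a parameter so each report can carry its own deadline once the real schedule is known.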
Open questions
A handful of answers sharpen the build. We work through these together - several are sign-off rather than technical.
Getting the data (sign-off)
The morning all-clear
Data quality & logic
Local-AI proof of concept
Working assumptions
These shape the estimates. The first is the big lever - everything scales with how much signal we can get off-site.
The key unknown: how much signal can leave site
We cannot reach the SQL servers directly today - they are on-prem. The whole approach assumes we can get a signal out: report receipts to a mailhook (proven), SSRS agent logs by email, and exception flags. The 17 June call opened a stronger option - a scoped or sandboxed DB connection (one view, or anonymised sample data) - which would let us build a real interface. If that never opens, the emailed signals alone are still enough for the morning all-clear.
SSRS is 2018, Native mode
Confirmed by Rodney across the calls - shapes how we read subscriptions and agent logs.
Reports can be CC'd to our mailhook
Live and working - eight days of clean captures prove the delivery-confirmation loop.
Status & error reports are non-sensitive and forwardable
Rodney confirmed on 17 Jun that he can forward the integration-status, day-summary, end-of-shift and DBA-admin reports - they show whether things ran, not commercial data.
A useful status email needs no deeper access
The morning all-clear can be built from the mailhook plus Trevor's agent-log query alone.
Forwarding report contents off-site needs sign-off
Logs feel fine; report contents are company data. Paul, possibly Tim, to confirm.
End-of-shift misses leave a detectable trace
Gap detection assumes there is a log or missing row we can reach without live SQL access.
SPF fails on report mail - raised with ITCO
Found in this week's data and flagged on 17 Jun; Matt to action with ITCO via a DNS change to the SPF record.
Local-AI at Timbersmart scale is viable on-prem
Proven in concept by Rodney with Ollama / Qwen; production scale and hardware need verifying.