author: Jonathan Stuckey
When migrating content between platforms addressing common blockers in Excel are usually the hidden issues which don't get resolved. In the first article on migrating Excel files, I went into background and understanding types of issues specific to Excel. This article addresses a couple of the most frequently encountered issues, and how they are fixed.
The next bit is exposition, so if you want to jump to the "how to" bit - go here and I have added section specifically on the manual steps for users.
Why Excel issues are challenging
When most organisations complain they don't have time, or the resources, to take on addressing these issues its usually due to lack of understanding behind the problem, how to address it, or through poorly defined business value in fixing it.
Most often its because for years the organisation has had a 'make-do and mend' approach to investment - and now someone needs to add-up all the work-a-rounds and duct-taped apps which are hidden (embedded) in your spreadsheets. Secondly, when desktop tools like Access were phased-out, people sought out the modelling and presentation tools which would work over structured data from other systems. Enter Excel.
This evolution of business IT tools are from the people, by the people, for the people - and not the organisation. Its a house-of-cards that develops with:
lack of education and skill in the workforce using sophisticated Excel tools, because training has become devalued in corporate world
lack of documentation because tools and logic were developed by subject experts, not support or IT Engineers
no corporate memory once the creator of tool(s) moves on (see above)
nagging fear, uncertainty and doubt that key processes or activities will stop/fail/be corrupted if 'tool' stops working.
internal culture does not lend itself to highlighting the risk
Net result: no willingness or funding to address the issues.
Tackling the issues
The range of issues were categorised in the previous article, but because of breadth of variety I'm only addressing issues: commonly found, easily fixed and often ignored problems - i.e. ones related to:
File or site linking
Embedded data-connections
Other sites and people do a far better job of addressing replacing Macros, or re-developing business logic in more modern languages for moving between VBE, VS .Net.. to Python.
How to address data-source linking or connection issues
Really you've got four options:
assess, buy and utilise available tools for scanning and bulk-updating as migrate
attempt to script and code changes required (not advised)
manually update all critical files
do nothing, hope that users largely fail to notice or complain (arguably least impact)
Tools which address migration issues
There have been many tools developed to address the resource intensive tasks around fixing document links, macros and integration, some going back to the end of the 1990's. One's like SharePoint Essentials Toolkit, ReplaceMagic, LinkTek... have been around for ages. They are tried and tested and actually cost relatively little to acquire. Running them is another matter.
These are designed for sizing your problem, and then operating at scale for migrating your documents.
If you look at the basic review of tools (just using the ones above) then:
Tool Name | Features | Pricing (USD) | User Reviews |
Cognillo (SPET) | Fixes links within Excel, PDF, MS Word documents; integrates with SharePoint; automated link correction; ability to maintain link integrity for signed documents. | Approximately $1,000 per license annually (highly variable based on user needs) | Effective for managing broken links. Good reporting functionality. Simple web UI. Relatively expensive compared to other solutions. |
ReplaceMagic | Batch processing, file type support (including Excel), preview function before finalizing changes, network access for scans, and compatibility with various Excel versions. | $399.00 on-off cost. | Effective and easy to use for fixing broken links. Performs bulk search/replace operations across various document types. Demonstrable time-saving capabilities during migrations and data management tasks. Old UI experience |
LinkTek | Automated scans and reporting pre-job; Automated link fixing during workflow. Supports various file types inc. Office formats. Protects against broken links during data migration | on-request | Excellent support Effective in fixing broken links during migrations. Software simplifies substantial data correction projects. Expensive model based on volumes and consulting wrap-a-round |
The Pros and Cons are relatively straight forward:
Pros | Cons |
---|---|
Ready to use | Less customisation |
Specialised features | May require training |
Established support | Will need clear requirements |
Relatively low cost of acquisition | Procurement process* |
Faster time to implementation | |
Cover more scenarios besides these |
|
Regular support | updates | |
Scalability - designed for automation |
I could go on, but in the case of addressing if you should bother with a tool I would suggest buy one. Developing one is a vanity project.
The gotcha, (knew it), is they require solid, repeatable (consistent) rules for the automation to work effectively and update embedded URLs/data-links etc. ...but this is also the case if you script or code. So, no surprises really.
Also tools require you have time to setup, test and validate your use-cases, procedures, identifying gaps etc. Which means enabling the people running them and not overloading them before put them into Production / Project use.
Scripting / Code fix-up for files
My first question before getting a developer for anything, is always: Has this been done before?
If you can find an example where it has, there's probably software, a tool, samples or patterns which are available to address the problem - which reduces your risk, time and effort.
So if you are thinking issues from moving your Excel files are unique - think again. People have been moving these documents around for nearly 40 years. Just in case you think you are different below is simplified set of pros and cons of purchased tool vs. Script/Code - solely based on fixing broken links & data connectors:
Pros | Cons |
Automated solution saves time compared to manual link fixing | Initial development time and cost to create the script/tool |
Can process large numbers of spreadsheets efficiently | Requires programming skills to develop and maintain |
Customizable to specific organisational needs | May need ongoing updates as Excel versions change |
Reduces human error in link modification | Potential for errors if script logic has flaws |
Improves data integrity and reliability | Security considerations for accessing/modifying files |
Enables standardised link updating process | May not handle all edge cases or complex link scenarios |
Can generate reports on broken/fixed links | Learning curve for users to operate the tool |
Potential for integration with other systems | Compatibility issues across different Excel versions |
Scalable for enterprise-wide deployment | Ongoing maintenance and support required |
Can incorporate data validation and error checking | May require IT approval/involvement to implement |
The Pros on this list are ones for a purchased software tool - the Cons indicate the inherent man-time, effort and risk from custom development.
Generally its very expensive to do this yourself, when consider most of the tools cost is between:
$400 (USD) and $10,000 (USD),
or to put it another way the equivalent of
0.5 - 9.25 working days for a intermediate consultant or experienced developer
I don't know about you, but I don't know of any internal scripting or code development which needs to do location, format and access checks, process validation and provide logging, and options for scheduling or execution - for data | document changes - that takes less than 5-days to complete, validate and test.
Recommendation: Just buy a tool. Save your developer for real challenges.
Manually update file to fix connection issues.
This approach can be captured and put in knowledgebase as a helpdesk standard-operating-procedure, or used as training guide for power users, in the event you don't try to fix your migration problem.
Little known fact #1: Excel embeds file and SharePoint location references using a fully qualified path, but it can be made to read local or relative paths.
What that means is you can fix your issues running spreadsheet from SP easily, by updating the embedded data-connection data-source reference. Or, open file and replace the old path, with a new one.
For example to change old fully-qualified path to a new one:
open Excel file
got to Data menu (1)
Click on Get Data > Data source settings (2)
Select the original file / web / connection setting
click on Change source (3)
update File path to new location of the linked file (4)
click OK, then Close
save File
You do exactly the same thing with a SharePoint URL reference, although with internet weblink you can only updated permissions or completely remove & re-add the link.
So what that looks like is changing the source by replacing a file-path (or UNC reference) e.g. \\Finance\Budgets\Planning-2024.xlxs (or a network drive pre-fix like S:\)
and setting that to the new SharePoint / OneDrive location using:
a new fully-qualified path name
e.g. https://<org_name>-my.sharepoint.com/my?id=%2Fpersonal%2Fjonathan_company_com%2FDocuments
e.g. https://<org_name>.sharepoint.com/sites/finance/Shared%20Documents
or
a new relative path name
e.g. <ROOTFOLDER>/sites/finance/Shared%20Documents/Reports/</ROOTFOLDER>
Little known fact #2: Data from SharePoint is captured as either Query or Connection file (.iqy), but both can be updated manually through Power Query Editor to correct the path reference(s) required.
Firstly when open files with these, you'll get the usual yellow-bar Security Warning prompt:
Click 'Enable Content' button, or nothing will happen
Confirm you intended to make the Document Trusted, or you wont be editing anything:
got to Data menu
Click on 'Get Data'
Click on 'launch Power Query Editor'
From here you can click in to (2) for Data Source Settings, and update the references by:
Select external data (or source location) reference
Click 'Change Source'
Noting you will almost certainly have to edit the Authentication connection permissions, to take account of the new location of either (or both) file and data-source.
While the simplest updates, for a single file or new data-source location, can be done with cut-n-paste of path (Basic), there's a good chance you are going to have to provide a site and library or folder path for multiple data-sources (files) to be updated - doing that we can use (Advanced) structure to create path for repeated use.
Note: this is the sort of approach you will need to consider for bulk updates when migrating content using a Tool.
Click OK
Click Close
Then test the connection in the spreadsheet.
If working, you are good to save the file and move on to the next issue.
Little known fact #3: Excel embeds data connections using a local connector file, or data-source definition.
There's a long history of integrated data connectors, from: OLEDB, ODBC, OData, HDFS and on... all of which have both embedded set of data-connection settings (data-source reference/path, interface type, schema reference or query details, connection authentication settings etc).
All these different types of connection fundamentally doing the same thing for different platforms and data-store formats, all held in a file e.g. .odc (Office Data Connection) file or equivalent, and all on the local machine where the Excel spreadsheet is created.
This means you can update the references using the native Excel wizard
got to Data menu
click 'Queries & Connections' to open panel
If there are multiple files, or external data-sources references you will see them listed in the column, or click on Connections to see connection-files used
Generic SharePoint list/library connections will usually be listed as 'query (#)' - with number representing which file it is. To access the embedded details, and ultimately update them accordingly:
highlight the specific Connection file in panel
double-click on Query file reference to open dialogue
click on 'Definition' tab
Most people don't realise the associated Connection files are created by default in Downloads folder on the machine where the Excel file was created, and these are often lost when people change machines, they are reimaged, or replaced.
If you have the specific connection file (or know the location), you can update the "Connection file" string. Use Browse button to navigate to correct (new) location and select file.
If the connection file has been lost, or is no longer accessible, you:
untick the 'Always use connection file' checkbox
update 'Command text' strings to reflect the new location
Assuming you have all the relevant details, once you've tested the new connection string, you can re-generate a connection file using the 'Export connection file' button at the bottom.
Note: an exported connection file is created as .odc (Office database connections) file, which by default is saved to the desktop 'Documents' folder.
Constraints and limitations
Lets be honest, this has covered the easy things. The stuff that you should all know if you use Excel. Where migration issues get gnarly is when you have Macros and Embedded program logic (code).
Really what you need for code though is to understand the level of risk and investment necessary to move your code to something that doesn't rely on "Bill's" knowledge (see previous article) to make it work when you move.
That means documenting your processes properly so you can do a proper risk analysis and cost/benefit assessment to fig out if you fix, or forget.
I have outlined manual fixes for simple connections in a spreadsheet, but when migrating a lot of content (100ks to millions of items) this is not practical. What you need to do then is a combination of analysis, triage, bulk-changes, exception management, historic/archive recovery and key-item updates. For that, you are back to tools and experienced migration consultants.
End-of-life support
It is worth noting Microsoft is making moves to deprecate Power Query Editor UI experience in favour of Copilot and modern UI configuration forms.
The approach for connectors, the underlying details about settings, connection strings, data-set references and authentication wont disappear any time soon but the means to manage it easily is looking a bit precarious on the roadmap.
Reference
Key things to learn about if you are doing a migration involving Excel files:
When reviewing tools for bulk-migration, you will need to know about workflow and pattern-matching.
About the author: Jonathan Stuckey
Comments