[BUG] deepspeed tries to call "hostname -I" which is not a valid flag for hostname. it should be "hostname -i" #6497
Comments
Hi @sirus20x6 - this issue looks to be similar to this one: #5597 Could you share the output of |
here you go!
and I believe that the posix way of doing this is actually
because net-utils, which is where the hostname binary comes from, is sort of an old deprecated package, even though a lot of people still have it installed because they have a lot of muscle memory around those tools |
small correction: actually, if you just want the first field, the POSIX way of getting loopback is
|
Thanks, @sirus20x6 - we are also looking at switching to just using |
I believe so. Hopefully that will be more cross-platform and resilient |
If you want, you could test with |
I will test as soon as I get home to my machine! |
doesn't install
|
+1 for replacing the call with

    import socket
    master_addr = socket.gethostbyaddr(socket.gethostname())[0]

This has been working for me on internal systems:

    + import socket
    - master_addr = None
      if rank == 0:
    -     hostname_cmd = ["hostname -I"]
    -     result = subprocess.check_output(hostname_cmd, shell=True)
    -     master_addr = result.decode('utf-8').split()[0]
    +     master_addr = socket.gethostbyaddr(socket.gethostname())[0]

Also see: #2837. I'd be happy to submit a PR and test further if it would be useful. |
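For anyone who wants to try the socket-based lookup outside DeepSpeed, here is a minimal sketch of the idea in the comment above. The function name `resolve_master_addr` and the injectable `gethostname` / `gethostbyaddr` parameters are hypothetical (added only so the logic can be exercised without touching real DNS), and the fallback to the bare hostname is an assumption, not something the proposed patch does.

```python
import socket


def resolve_master_addr(gethostname=socket.gethostname,
                        gethostbyaddr=socket.gethostbyaddr):
    """Return a name for this host without shelling out to `hostname`.

    Mirrors the one-liner from the comment above:
        socket.gethostbyaddr(socket.gethostname())[0]
    but falls back to the plain hostname if the lookup fails
    (the fallback is an assumption of this sketch).
    """
    name = gethostname()
    try:
        # gethostbyaddr returns (primary_name, alias_list, ip_list)
        return gethostbyaddr(name)[0]
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses
        return name
```

Calling `resolve_master_addr()` with no arguments uses the real `socket` functions. Note that the result is a host name rather than an IP address, so the other ranks need to be able to resolve it.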
BUGFIX for Apple Silicon hostname
BUGFIX for Apple Silicon hostname #6497 --------- Signed-off-by: Fabien Dupont <fdupont@redhat.com> Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com> Signed-off-by: inkcherry <mingzhi.liu@intel.com> Signed-off-by: Roman Fitzjalen <romaactor@gmail.com> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Co-authored-by: Fabien Dupont <fabiendupont@fabiendupont.fr> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Liangliang Ma <1906710196@qq.com> Co-authored-by: inkcherry <mingzhi.liu@intel.com>
This should be resolved with this PR: #6990. Let me know if you are still having issues with this. |
Describe the bug
A clear and concise description of what the bug is.
DeepSpeed tries to call "hostname -I", which is not a valid flag for hostname on my system. It should be "hostname -i".
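To illustrate why the shell call is fragile, here is a hedged sketch that probes `hostname` with a given flag and returns `None` instead of raising when the flag is rejected, the binary is missing, or the output is empty. The helper name `first_ip_from_hostname` and its `run` parameter are hypothetical and exist only to make the behaviour testable. As far as I know, `-I` is specific to the Linux `hostname` implementation that lists all configured addresses, which is why it fails elsewhere.

```python
import subprocess


def first_ip_from_hostname(flag="-I", run=subprocess.check_output):
    """Return the first whitespace-separated field of `hostname <flag>`.

    Returns None when the command fails (e.g. unsupported flag),
    the binary is not found, or the output is empty.
    """
    try:
        out = run(["hostname", flag], stderr=subprocess.DEVNULL)
    except (subprocess.CalledProcessError, FileNotFoundError):
        return None
    fields = out.decode("utf-8").split()
    # Guard against empty output, which would make split()[0] raise IndexError
    return fields[0] if fields else None
```

A caller could try `first_ip_from_hostname("-I")` first and fall back to another method when it returns `None`, rather than letting the launcher crash.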
To Reproduce
Steps to reproduce the behavior:
Expected behavior
A clear and concise description of what you expected to happen.
ds_report output
Please run ds_report to give us details about your setup.
Screenshots
If applicable, add screenshots to help explain your problem.
System info (please complete the following information):
Launcher context
Are you launching your experiment with the deepspeed launcher, MPI, or something else?
Docker context
Are you using a specific docker image that you can share?
Additional context
Add any other context about the problem here.
the offending code: